CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation-in-part of application
Serial No. 209,009, filed June 20, 1988 (Attorney Docket
PAST-059-B), which is a continuation-in-part of application
Serial No. 134,130, filed December 17, 1987 (Attorney Docket
PAST-059-A), which is a continuation-in-part of application
Serial No. 133,687, filed December 16, 1987 (Attorney Docket
PAST-059). The entire disclosure of each of these copending
applications is relied upon and incorporated herein by reference.

BACKGROUND OF THE INVENTION 

This invention relates to nucleotide sequences, polypeptides 
encoded by the nucleotide sequences, and to their use in diagnos- 
tic and pharmaceutical applications. 

Primary hepatocellular carcinoma (HCC) represents the most 
common cancer, especially in young men, in many parts of the 
world (as in China and in much of Asia and Africa) (reviewed in 
Tiollais et al., 1985). Its etiology was investigated mostly by
epidemiological studies, which revealed that, beyond some minor
potential agents such as aflatoxin and sex steroid hormones,
Hepatitis B virus (HBV) chronic infection could account for a
large fraction of liver cancers (Beasley and Hwang, 1984).

HBV DNA has been found to be integrated in the genome of 
most cases of HCCs studied (Edman et al., 1980; Brechot et al.,
1980; Chakraborty et al., 1980; Chen et al., 1982). Nonetheless,
the role of those sequences in liver oncogenesis remains unclear.

A single HBV integration in a HCC sample in a short liver
cell sequence has been reported recently. The sequence was found
to be homologous to steroid receptor genes and to the cellular
proto-oncogene c-erbA (Dejean et al., 1986).

Ligand-dependent transcriptional activators, such as steroid 
or thyroid hormone receptors, have recently been cloned allowing 
rapid progress in the understanding of their mechanism of action. 
Nevertheless, there exists a need in the art for the identifica- 
tion of transcripts that may encode for activational elements, 
such as nuclear surface receptors, that may play a role in 
hepatocellular carcinoma. Such findings would aid in identifying 
corresponding transcripts in susceptible individuals. In addi- 
tion, identification of transcripts could aid in elucidating the 
mechanisms by which HCC occurs. 

Retinoids, a class of compounds including retinol (vitamin 
A), retinoic acid (RA), and a series of natural and synthetic
derivatives, exhibit striking effects on cell proliferation, dif- 
ferentiation, and pattern formation during development 
(Strickland and Mahdavi, 1978; Breitman et al., 1980; Roberts and 
Sporn, 1984; Thaller and Eichele, 1987). Until recently, the mo- 
lecular mechanism by which these compounds exert such potent 
effects was unknown, although retinoids were thought to modify 
their target cells through a specific receptor. 

Except for the role of retinoids in vision, their mechanism 
of action is not well understood at the molecular level. Several 
possible mechanisms have been suggested. One hypothesis proposes
that retinoids are needed to serve as the lipid portion of
glycolipid intermediates involved in certain specific
glycosylation reactions. Another mechanism, which may account
for the various effects of retinoids on target cells, is that 
they alter genomic expression in such cells. It has been sug- 
gested that retinoids may act in a manner analogous to that of 
the steroid hormones and that the intracellular binding proteins 
(cellular retinol-binding and retinoic acid-binding protein) play
a critical part in facilitating the interaction of retinoids with 
binding sites in the cell nucleus. 

For example, the observation that the RA-induced differenti- 
ation of murine F9 embryonal carcinoma cells is accompanied by 
the activation of specific genes has led to the proposal that RA, 
like the steroid and thyroid hormones, could exert its transcrip- 
tional control by binding to a nuclear receptor (Roberts and 
Sporn, 1984). However, the biochemical characterization of this
receptor had been hampered by high affinity RA-binding sites corresponding
to the cellular retinoic acid binding protein (CRABP),
which is thought to be a cytoplasmic shuttle for RA (Chytil and 
Ong, 1984). 

In any event, retinoids are currently of interest in derma- 
tology. The search for new retinoids has identified a number of 
compounds with a greatly increased therapeutic index as compared 
with naturally occurring retinoids. Extensive clinical testing 
of two of these retinoids, 13-cis-retinoic acid and the aromatic
analog etretinate, has led to their clinical use in dermatology.

In addition, several lines of evidence suggest that important
relations exist between retinoids and cancer. A number of major
diseases, in addition to cancer, are characterized by excessive 
proliferation of cells, often with excessive accumulation of 
extracellular matrix material. These diseases include rheumatoid 
arthritis, psoriasis, idiopathic pulmonary fibrosis, scleroderma, 
and cirrhosis of the liver, as well as the disease process 
atherosclerosis. The possibility exists that retinoids, which 
can influence cell differentiation and proliferation, may be of 
therapeutic value in some of these proliferative diseases. There 
exists a need in the art for reagents and methods for carrying 
out studies of receptor expression and effector function to 
determine whether candidate drugs are agonists or antagonists of 
retinoid activity in biological systems. 

There also exists a need in the art for identification of 
retinoic acid receptors and for sources of retinoic acid 
receptors in highly purified form. The availability of the 
purified receptor would make it possible to assay fluids for 
agonists and antagonists of the receptor.

SUMMARY OF THE INVENTION

This invention aids in fulfilling these needs in the art.
More particularly, this invention provides a cloned DNA sequence
encoding a polypeptide of a newly identified cellular gene,
which has been named hap. The DNA sequence has the formula shown
in Figure 2. More particularly, the sequence comprises:
^¥ €TTTOACTGTA i rGGATCTTCTGTCAGTG A€ ffCCTGGGCAAATeCTGATTCTACACTGCGA QT€€



GTCTTCCTGCATGCTCCAGGAGAAAGCTCTCAAAGCATGCTTCAGTGGATTGACCCAAACCGAATG 
GCAGCATCGGCACACTGCTCAATCAATTGAAACACAGAGCACCAGCTCTGAGGAACTCGTCCCAAG 
eeee CCATCTCCACTTCCTCCCCCTGG AQTOATCAAACCCTGCTTCGTCTGCCAGGACAAATCATC 
AOGGTACCACTATGGGGTCAGCGCCTGTGAGGGATGAAGGGCTTTTTCCGCAGAAGTATTCAGAAG 
AATATGATTTACACTTGTCACCGAGATAAGAACTGTGTTATTAATAAAGTCACCAGGAATCGATGC 
CAATACTGTCGACTCCAGAAGTGCTTTGAAGTGGGAATGTCCAAAGAATCTGTCAGGAATGACAGG 
AACAAGAAAAAGAAGGAGACTTCGAAGCAAGAATGCACAGAGAGCTATGAAATGACAGCTGAGTTG 
GAee ATC T CACAGAGAAGATCCGAA - AAGCTCftCCAGGAx\AC r rTTCCCTTCACTCTCGCAGCTGGG l P 
AAATACACCACGAATTCCAGTGCTGACCATCGAGTCCGACTGGACCTGGGCCTCTGGGACAAATTC 
AGTGAACTGGCCACCAAGTGCATTATTAAGATCGTGGAGTTTGCTAAACGTCTGCCTGGTTTCACT 
GGCTTGACCATCGCAGACCAAATTACCCTGCTGAAGGCCGCCTGCCTGGACATCCTGATTCTTAGA 
ATTTGCACCAGGTATACCCCAGAACAAGACACCATGACTTTCTCAGACGGCCTTACCCTAAATCGA 
ACTCAGATGCACAATGCTGGATTTGGTCCTCTGACTGACCTTGTGTTCACCTTTGCCAACCAGCTC 
CTGCCTTTGGAAATGGATGACACAGAAACAGGCCTTCTCAGTGCCATCTGCTTAATCTGTGGAGAC 
CGCCAGGACCTTGAGGAACCGACAAAAGTAGATAAGCTACAAGAACCATTGCTGGAAGCACTAAAA 
ATTTATATCAGAAAAAGACGACCCAGCAAGCCTCACATGTTTCCAAAGATCTTAATGAAAATCACA 
GATCTCCGTAGCATCAGTGCTAAAGGTGCAGAGCGTGTAATTACCTTGAAAATGGAAATTCCTGGA 
TCAATGCCACCTCTCATTCAAGAAATGATGGAGAATTCTGAAGGACATGAACCCTTGACCCCAAGT 
TCAAGTGGGAACACAGCAGAGCACAGTCCTAGCATCTCACCCAGCTCAGTGGAAAACAGTGGGGTC 
AGTCAGTCACCACTCGTGCAATAA . 

The invention also covers variants and fragments of the DNA se- 
quence. The DNA sequence is in a purified form. 

This invention also provides a probe consisting of a radionuclide
bonded to the DNA sequence of the invention.

In addition, this invention provides a hybrid duplex molecule
consisting essentially of the DNA sequence of the invention
hydrogen bonded to a nucleotide sequence of complementary base 
sequence, such as DNA or RNA. 
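
As a minimal sketch of what a "nucleotide sequence of complementary base sequence" means in practice, the following Python fragment derives the DNA or RNA strand that would hydrogen bond to a given DNA strand. The function names are illustrative only and are not part of the disclosure; the example fragment is the last 24 nucleotides of the coding sequence listed above.

```python
# Watson-Crick pairing for DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the DNA strand that base-pairs with `dna`, written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def complementary_rna(dna: str) -> str:
    """Return the RNA strand (U in place of T) that base-pairs with `dna`."""
    return reverse_complement(dna).replace("T", "U")


# Final 24 nucleotides of the coding sequence (ends in the TAA stop codon).
FRAGMENT = "AGTCAGTCACCACTCGTGCAATAA"

if __name__ == "__main__":
    print(reverse_complement(FRAGMENT))
    print(complementary_rna(FRAGMENT))
```

Applying `reverse_complement` twice returns the original strand, which is a quick sanity check on any probe sequence derived this way.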

Further, this invention provides a polypeptide comprising an 
amino acid sequence of hap protein, wherein the polypeptide contains
the amino acid sequence shown in Figure 2. More particularly,
the amino acid sequence comprises:

MetPheAspCysMetAspValLeuSerValSerProGlyGlnIleLeuAspPheTyrThrAla
SerProSerSerCysMetLeuGlnGluLysAlaLeuLysAlaCysPheSerGlyLeuThrGln
ThrGluTrpGlnHisArgHisThrAlaGlnSerIleGluThrGlnSerThrSerSerGluGlu
LeuValProSerProProSerProLeuProProProArgValTyrLysProCysPheValCys
GlnAspLysSerSerGlyTyrHisTyrGlyValSerAlaCysGluGlyCysLysGlyPhePhe
ArgArgSerIleGlnLysAsnMetIleTyrThrCysHisArgAspLysAsnCysValIleAsn
LysValThrArgAsnArgCysGlnTyrCysArgLeuGlnLysCysPheGluValGlyMetSer
LysGluSerValArgAsnAspArgAsnLysLysLysLysGluThrSerLysGlnGluCysThr
GluSerTyrGluMetThrAlaGluLeuAspAspLeuThrGluLysIleArgLysAlaHisGln
GluThrPheProSerLeuCysGlnLeuGlyLysTyrThrThrAsnSerSerAlaAspHisArg
ValArgLeuAspLeuGlyLeuTrpAspLysPheSerGluLeuAlaThrLysCysIleIleLys
IleValGluPheAlaLysArgLeuProGlyPheThrGlyLeuThrIleAlaAspGlnIleThr
LeuLeuLysAlaAlaCysLeuAspIleLeuIleLeuArgIleCysThrArgTyrThrProGlu
GlnAspThrMetThrPheSerAspGlyLeuThrLeuAsnArgThrGlnMetHisAsnAlaGly
PheGlyProLeuThrAspLeuValPheThrPheAlaAsnGlnLeuLeuProLeuGluMetAsp
AspThrGluThrGlyLeuLeuSerAlaIleCysLeuIleCysGlyAspArgGlnAspLeuGlu
GluProThrLysValAspLysLeuGlnGluProLeuLeuGluAlaLeuLysIleTyrIleArg
LysArgArgProSerLysProHisMetPheProLysIleLeuMetLysIleThrAspLeuArg
SerIleSerAlaLysGlyAlaGluArgValIleThrLeuLysMetGluIleProGlySerMet
ProProLeuIleGlnGluMetMetGluAsnSerGluGlyHisGluProLeuThrProSerSer
SerGlyAsnThrAlaGluHisSerProSerIleSerProSerSerValGluAsnSerGlyVal
SerGlnSerProLeuValGln.

The invention also covers serotypic variants of the polypeptide
and fragments of the polypeptide. The polypeptide is free from 
human serum proteins, virus, viral proteins, human tissue, and 
human tissue components. Preferably, the polypeptide is free 
from human, blood-derived protein. 

The hap protein (hap for hepatoma) exhibits strong homology
with the human retinoic acid receptor (RAR) (de Thé, H., Marchio,
A., Tiollais, P. & Dejean, A., Nature 330, 667-670 (1987);
Petkovich, M., Brand, N.J., Krust, A. & Chambon, P., Nature 330,
444-450 (1987)), a receptor that has been recently characterized
(Petkovich, M., Brand, N.J., Krust, A. & Chambon, P., Nature 330,
444-450 (1987); Giguere, V., Ong, E.S., Segui, P. & Evans, R.M.,
Nature 330, 624-629 (1987)). To test the possibility that the hap
protein might also be a retinoid receptor, a chimaeric receptor
was created by replacing the putative DNA binding domain of hap
with that of the human oestrogen receptor (ER). The resulting
hap-ER chimaera was then tested for its ability to trans-activate
an oestrogen-responsive reporter gene (vit-tk-CAT) in the presence
of possible receptor ligands. It was discovered that
retinoic acid (RA) at physiological concentrations is effective
in inducing the expression of this reporter gene by the hap-ER
chimaeric receptor. See Nature, 332:850-853 (1988). This demonstrates
the existence of two human retinoic acid receptors, designated
RAR-α and RAR-β.

More particularly, it has been discovered that the hap protein
is a second retinoic acid receptor. Thus, the expression
"hap protein" is used interchangeably herein with the abbreviation
"RAR-β" for the second human retinoic acid receptor.

Also, this invention provides a process for selecting a 
nucleotide sequence coding for hap protein or a portion thereof 
from a group of nucleotide sequences comprising the step of 
determining which of the nucleotide sequences hybridizes to a DNA 
sequence of the invention. The nucleotide sequence can be a DNA 
sequence or an RNA sequence. The process can include the step of 
detecting a label on the nucleotide sequence. 

Still further, this invention provides a recombinant vector 
comprising lambda-NM1149 having an Eco RI restriction endonuclease
site into which has been inserted the DNA sequence of the inven- 
tion. The invention also provides plasmid pCOD20, which com- 
prises the DNA sequence of the invention. 

This invention provides an E. coli bacterial culture in a
purified form, wherein the culture comprises E. coli cells containing
DNA, wherein a portion of the DNA comprises the DNA sequence
of the invention. Preferably, the E. coli is strain TG-1.

In addition, this invention provides a method of using the
purified retinoic acid receptor of the invention for assaying a
medium, such as a fluid, for the presence of an agonist or antagonist
of the receptor. In general, the method comprises providing
a known concentration of a receptor of the
invention, incubating the receptor with a ligand of the receptor
and a suspected agonist or antagonist under conditions sufficient
to form a ligand-receptor complex, and assaying for
ligand-receptor complex or for free ligand or for non-complexed
receptor. The assay can be conveniently carried out using labelled
reagents as more fully described hereinafter, and conventional
techniques based on nucleic acid hybridization,
immunochemistry, and chromatography, such as TLC, HPLC, and affinity
chromatography.

In another method of the invention, a medium is assayed for
stimulation of transcription of the RAR-β gene or translation of
the gene by an agonist or antagonist. For example, β-receptor
binding retinoids can be screened in this manner.

BRIEF DESCRIPTION OF THE DRAWINGS

This invention will be described in greater detail with reference
to the drawings in which:

Fig. 1 is a restriction map of human liver hap cDNA; 
Fig. 2 is the nucleotide sequence of human liver hap cDNA 
and a predicted amino acid sequence of human liver hap cDNA; 

Fig. 3 depicts the distribution of hap mRNA in different 
tissues as determined by Northern blot analysis; 

Fig. 4 depicts the distribution of hap mRNA in HCC and HCC 
derived cell lines as determined by Northern blot analysis;

Fig. 5 is a fluorograph of hap polypeptide synthesized in
vitro and isolated on SDS-polyacrylamide gel;

Fig. 6 shows the alignment of hap translated amino acid sequence
with several known sequences for thyroid and steroid hormone
receptors;

Fig. 7 is a schematic alignment of similar regions identi- 
fied as A/B, C, D, and E of the amino acid sequences of Fig. 6; 

Fig. 8 depicts hap related genes in vertebrates (A) and in 
humans (B and C) as determined by Southern blot analysis; 

Fig. 9 shows the tissue distribution of RAR α and β transcripts;

Fig. 10 shows the dose- and time-response of RAR α and β
transcripts after retinoic acid treatment of PLC/PRF/5 cells;

Fig. 11 shows the effect of RNA and protein synthesis inhibitors
on the levels of RAR α and β mRNAs;

Fig. 12 reports the results of nuclear run-on analysis of
RAR β gene transcription after RA treatment;

Fig. 13 reports the results of nuclear run-on analysis of
RAR β transcription in two hepatoma cell lines;

Fig. 14 shows the resulting kinetic analysis of RAR mRNA
degradation;

Fig. 15 depicts a nucleotide sequence analysis extending
lambda-13 RAR-β by 72 bp; and






LAW OFFICES

Finnegan, Henderson,
Farabow, Garrett
& Dunner

1775 K STREET, N.W.
WASHINGTON, D.C. 20006
(202) 293-6850




Fig. 16 is a complete restriction map of a cloned
HindIII-BamHI genomic DNA insert containing the nucleotide sequence
of Fig. 15.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS 

A. IDENTIFICATION OF A PROTEIN, NAMED hap PROTEIN, HAVING 
DNA-BINDING AND LIGAND-BINDING DOMAINS, AND 
IDENTIFICATION OF THE DNA SEQUENCE ENCODING hap PROTEIN 

As previously noted, ligand-dependent transcriptional 
activators, such as steroid or thyroid hormone receptors, have 
recently been cloned. The primary structure and expression of a 
new gene, hap, closely related to steroid or thyroid hormone
receptor genes have now been discovered. The hap product 
exhibits two regions highly homologous to the conserved DNA- and 
hormone-binding domains of previously cloned receptors. 

More particularly, the cloning of a cDNA corresponding to a 
novel steroid/thyroid hormone receptor-related gene has been 
achieved. The cDNA was recovered from a human liver cDNA library 
using a labelled cellular DNA fragment previously isolated from a 
liver tumor. The fragment contained a 147 bp putative exon in 
which HBV inserted. The sequence of this cellular gene, which is 
referred to herein as hap for hepatoma, reveals various structural
features characteristic of c-erbA/steroid receptors (Dejean
et al., 1986). The receptor-related protein is likely to be a novel
member of the superfamily of transcriptional regulatory proteins 
that includes the thyroid and steroid hormone receptors. 

It has been discovered that the hap gene is transcribed at
low level in most human tissues, but the gene is overexpressed in
prostate and kidney. Moreover, six out of seven hepatoma and
hepatoma-derived cell lines express a small hap transcript, which
is undetectable in normal adult and fetal livers, but is present
in all non-hepatic tissues tested. Altered expression of hap may
be involved in liver oncogenesis.

These findings, as well as other discoveries relating to
this invention, will now be described in detail.

A.1 Cloning and Sequencing of a Hap cDNA

A human liver cDNA library was screened using a nick-
translated 350 bp Eco RI genomic fragment (MNT probe) previously
cloned from a hepatoma sample. The fragment contained the putative
147 bp cellular exon in which HBV integration took place
(Dejean et al., 1986).

Four positive 3' co-terminal clones were isolated from the 2
x 10^ plaques screened and the restriction maps were deduced for
each of the cDNA clone Eco RI inserts. The longest one was identified
as lambda-13. The restriction map of lambda-13 is shown in
Fig. 1.

Referring to Fig. 1, the insert of clone lambda-13 is nearly
a full-length cDNA for the hap gene. Noncoding sequences (lines)
and coding sequences (boxed portion) are indicated. Restriction
sites are:

K KpnI
P PvuII
B BamHI
H HindIII.

The lambda-13 clone was subjected to nucleotide sequence 
analysis. The nucleotide sequence is shown in Fig. 2. The 
nucleotide sequence of the hap cDNA is presented in the 5' to 3'
orientation. The numbers on the right refer to the position of
the nucleotides. Numbers above the deduced translated sequence
indicate amino acid residues. The four short open reading frames
in the 5' untranslated region are underlined. Adenosine residues
(20) are found at the 3' end of lambda-13. The putative polyadenylation
signal site (AATAAA) is boxed. The region homologous
to the DNA-binding domain of known thyroid/steroid hormone
receptors is indicated by horizontal arrows. The exon, previously
cloned from a HCC sample genomic DNA library and in which HBV
integration took place, is bracketed.

This invention, of course, includes variants of the nucleotide
sequence shown in Fig. 2 encoding hap protein or a serotypic
variant of hap protein exhibiting the same immunological reactivity
as hap protein.

The DNA sequence of the invention is in a purified form. 
Generally, the DNA sequence is free of human serum proteins, 
viral proteins, and nucleotide sequences encoding these proteins. 
The DNA sequence of the invention can also be free of human tissue.

The DNA sequence of the invention can be used as a probe for
the detection of a nucleotide sequence in a biological material, 
such as tissue or body fluids. The polynucleotide probe can be 
labeled with an atom or inorganic radical, most commonly using a 
radionuclide, but also perhaps with a heavy metal. 

In some situations it is feasible to employ an antibody 
which will bind specifically to the probe hybridized to a single 
stranded DNA or RNA. In this instance, the antibody can be la- 
beled to allow for detection. The same types of labels which are 
used for the probe can also be bound to the antibody in accor- 
dance with known techniques. 

Conveniently, a radioactive label can be employed. Radioactive
labels include 32P, 3H, 14C, or the like. Any radioactive
label can be employed, which provides for an adequate signal and
has sufficient half-life. Other labels include ligands that can
serve as a specific binding member to a labeled antibody,
fluorescers, chemiluminescers, enzymes, antibodies which can
serve as a specific binding pair member for a labeled ligand, and
the like. The choice of the label will be governed by the effect 
of the label on the rate of hybridization and binding of the 
probe to the DNA or RNA. It will be necessary that the label 
provide sufficient sensitivity to detect the amount of DNA or RNA 
available for hybridization. 

Ligands and anti-ligands can be varied widely. Where a 
ligand has a natural receptor, namely ligands such as biotin, 
thyroxine, and cortisol, these ligands can be used in conjunction
with labeled naturally occurring receptors. Alternatively, any
compound can be used, either haptenic or antigenic, in combination
with an antibody.

Enzymes of interest as labels are hydrolases, particularly 
esterases and glycosidases, or oxidoreductases, particularly
peroxidases. Fluorescent compounds include fluorescein and its
derivatives, rhodamine and its derivatives, dansyl,
umbelliferone, etc. Chemiluminescers include luciferin and
luminol.

A.2 Amino Acid Sequence of Protein Encoded by hap Gene

Based upon the sequence of the hap cDNA, the amino acid se- 
quence of the protein encoded by hap gene was determined. With 
reference to Fig. 2, the deduced amino acid sequence encoded by 
the gene reveals a long open reading frame of 448 amino acids 
corresponding to a predicted polypeptide of relative molecular 
mass 51,000.

A putative initiator methionine codon and an in-frame 
terminator codon are positioned respectively at nucleotides 322 
and 1666 in the sequence (Fig. 2). However, two other methionine 
codons are found 4 and 26 triplets downstream from the first ATG 
making the determination of the initiation site equivocal. 
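
The figures quoted in the two preceding paragraphs can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the residue list is simply the first 27 residues of the amino acid sequence given in the Summary. It confirms that an ATG at nucleotide 322 and an in-frame terminator at nucleotide 1666 span 448 codons, and that the two further methionines lie 4 and 26 triplets downstream of the first ATG.

```python
START_NT = 322   # first nucleotide of the putative initiator ATG
STOP_NT = 1666   # first nucleotide of the in-frame terminator codon


def codon_count(start_nt: int, stop_nt: int) -> int:
    """Number of sense codons between a start codon and an in-frame stop."""
    span = stop_nt - start_nt
    if span % 3:
        raise ValueError("start and stop codons are not in the same frame")
    return span // 3


# First 27 residues of the deduced hap protein (N-terminus first).
N_TERMINUS = [
    "Met", "Phe", "Asp", "Cys", "Met", "Asp", "Val", "Leu", "Ser",
    "Val", "Ser", "Pro", "Gly", "Gln", "Ile", "Leu", "Asp", "Phe",
    "Tyr", "Thr", "Ala", "Ser", "Pro", "Ser", "Ser", "Cys", "Met",
]


def downstream_met_offsets(residues):
    """Offsets, in triplets from the first codon, of later Met codons."""
    return [i for i, res in enumerate(residues) if res == "Met" and i > 0]


def met_nucleotide_positions(start_nt, offsets):
    """Derived nucleotide position of each downstream ATG."""
    return [start_nt + 3 * off for off in offsets]


if __name__ == "__main__":
    offsets = downstream_met_offsets(N_TERMINUS)
    print(codon_count(START_NT, STOP_NT))
    print(offsets)
    print(met_nucleotide_positions(START_NT, offsets))
```

The nucleotide positions printed for the downstream ATG codons are derived from the stated start position and are not figures quoted in the text.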

The coding sequence is preceded by a 5' region of at least 
321 nucleotides which contains four short open reading frames 
delineated by initiator and stop codons (Fig. 2). Translation 
usually starts, in eukaryotes, at the 5' most ATG triplet, but 
the finding of open reading frames in the 5' 'untranslated'
region is not unprecedented (Kozak, 1986). It is not known yet
whether those sequences are used for translation and exert any 
function in the cell. 

In the 3' untranslated region, 1326 nucleotides long, no 
long open reading frame is present. A putative polyadenylation
signal (AATAAA) is found 19 bp upstream from the polyadenylation
site.

It will be understood that the present invention is intended 
to encompass the protein encoded by the hap gene, i.e. hap pro- 
tein, and fragments thereof in highly purified form. The hap 
protein can be expressed in a suitable host containing the DNA 
sequence of the invention. This invention also includes poly- 
peptides in which all or a portion of the binding site of hap 
protein is linked to a larger carrier molecule, such as a poly- 
peptide or a protein, and in which the resulting product exhibits 
specific binding in vivo and in vitro. In this case, the polypeptide
can be smaller or larger than the proteinaceous binding
site of the protein of the invention. 

It will be understood that the polypeptide of the invention 
encompasses molecules having equivalent peptide sequences. By 
this it is meant that peptide sequences need not be identical. 
Variations can be attributable to local mutations involving one 
or more amino acids not substantially affecting the binding 
capacity of the polypeptide. Variations can also be attributable 
to structural modifications that do not substantially affect 
binding capacity. Thus, for example, this invention is intended 
to cover serotypic variants of hap protein.

Three particular regions of hap gene are of interest. Two
of them are located in the D region (amino acids comprised
between 46 and 196), which have been shown by the inventors to be
highly immunogenic. Amino acids 46-196 have the sequence:

GlnHisArgHisThrAlaGlnSerIleGluThrGlnSerThrSerSerGluGlu
LeuValProSerProProSerProLeuProProProArgValTyrLysProCysPheValCys
GlnAspLysSerSerGlyTyrHisTyrGlyValSerAlaCysGluGlyCysLysGlyPhePhe
ArgArgSerIleGlnLysAsnMetIleTyrThrCysHisArgAspLysAsnCysValIleAsn
LysValThrArgAsnArgCysGlnTyrCysArgLeuGlnLysCysPheGluValGlyMetSer
LysGluSerValArgAsnAspArgAsnLysLysLysLysGluThrSerLysGlnGluCysThr
GluSerTyrGluMetThrAlaGluLeuAspAspLeuThrGluLysIleArgLysAlaHisGln
GluThrPheProSerLeuCys.

One peptide of interest in the D region is comprised of amino
acids 151-167 and has the sequence:

ValArgAsnAspArgAsnLysLysLysLysGluThrSerLysGlnGluCys.

A second peptide in the D region is located between amino
acids 175 and 185. This peptide has the amino acid sequence:

AlaGluLeuAspAspLeuThrGluLysIleArg.

Another peptide of interest is located at the end of the hap
protein, between amino acids 440 and 448. This peptide has the amino
acid sequence:

GlyValSerGlnSerProLeuValGln.

Other peptides having formulas derived from the nucleotide
sequence of hap gene can be used as reagents, particularly to
obtain antibodies for diagnostic purposes, as defined hereinabove.
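
The residue numbering used for these peptides can be checked directly against the 448-residue sequence given in the Summary. The following is a minimal sketch, with hypothetical helper names, that splits the three-letter-code sequence into residues and extracts a segment by its first and last (1-based, inclusive) positions.

```python
# Deduced hap protein sequence in three-letter code, 21 residues per row
# except the last, as listed in the Summary.
HAP_PROTEIN = (
    "MetPheAspCysMetAspValLeuSerValSerProGlyGlnIleLeuAspPheTyrThrAla"
    "SerProSerSerCysMetLeuGlnGluLysAlaLeuLysAlaCysPheSerGlyLeuThrGln"
    "ThrGluTrpGlnHisArgHisThrAlaGlnSerIleGluThrGlnSerThrSerSerGluGlu"
    "LeuValProSerProProSerProLeuProProProArgValTyrLysProCysPheValCys"
    "GlnAspLysSerSerGlyTyrHisTyrGlyValSerAlaCysGluGlyCysLysGlyPhePhe"
    "ArgArgSerIleGlnLysAsnMetIleTyrThrCysHisArgAspLysAsnCysValIleAsn"
    "LysValThrArgAsnArgCysGlnTyrCysArgLeuGlnLysCysPheGluValGlyMetSer"
    "LysGluSerValArgAsnAspArgAsnLysLysLysLysGluThrSerLysGlnGluCysThr"
    "GluSerTyrGluMetThrAlaGluLeuAspAspLeuThrGluLysIleArgLysAlaHisGln"
    "GluThrPheProSerLeuCysGlnLeuGlyLysTyrThrThrAsnSerSerAlaAspHisArg"
    "ValArgLeuAspLeuGlyLeuTrpAspLysPheSerGluLeuAlaThrLysCysIleIleLys"
    "IleValGluPheAlaLysArgLeuProGlyPheThrGlyLeuThrIleAlaAspGlnIleThr"
    "LeuLeuLysAlaAlaCysLeuAspIleLeuIleLeuArgIleCysThrArgTyrThrProGlu"
    "GlnAspThrMetThrPheSerAspGlyLeuThrLeuAsnArgThrGlnMetHisAsnAlaGly"
    "PheGlyProLeuThrAspLeuValPheThrPheAlaAsnGlnLeuLeuProLeuGluMetAsp"
    "AspThrGluThrGlyLeuLeuSerAlaIleCysLeuIleCysGlyAspArgGlnAspLeuGlu"
    "GluProThrLysValAspLysLeuGlnGluProLeuLeuGluAlaLeuLysIleTyrIleArg"
    "LysArgArgProSerLysProHisMetPheProLysIleLeuMetLysIleThrAspLeuArg"
    "SerIleSerAlaLysGlyAlaGluArgValIleThrLeuLysMetGluIleProGlySerMet"
    "ProProLeuIleGlnGluMetMetGluAsnSerGluGlyHisGluProLeuThrProSerSer"
    "SerGlyAsnThrAlaGluHisSerProSerIleSerProSerSerValGluAsnSerGlyVal"
    "SerGlnSerProLeuValGln"
)


def split_residues(seq3: str) -> list:
    """Split a three-letter-code string into a list of residues."""
    if len(seq3) % 3:
        raise ValueError("length is not a multiple of three")
    return [seq3[i:i + 3] for i in range(0, len(seq3), 3)]


def segment(residues: list, first: int, last: int) -> str:
    """Return residues first..last (1-based, inclusive) as one string."""
    return "".join(residues[first - 1:last])


RESIDUES = split_residues(HAP_PROTEIN)

if __name__ == "__main__":
    print(len(RESIDUES))
    for first, last in [(151, 167), (175, 185), (440, 448)]:
        print(first, last, segment(RESIDUES, first, last))
```

The same helper can be used to pull out any other candidate region, for example the 46-196 span, before ordering a synthetic peptide.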

The most favorable region is found in the hinge region
(amino acids 147 to 193). This region includes amino acids 150
to 170, which meet the following criteria:

The region includes very hydrophilic sequences, namely,
the sequences 154-160 (No. 1/Hopp); 155-161 (No. 1/Doolittle);
155-159 (No. 1/acrophilic).

The region includes a peptide, namely, amino acids
156-162, No. 5 in mobility.

The polypeptide of this region has a low probability of
adopting a structure in the form of a folded sheet or a helix,
but, in contrast, a good probability of an omega loop and one
beta-turn, very marked in the Asp-Arg-Asn-Lys tetrapeptide.

The region does not have a potential site of
N-glycosylation nearby; several suggestions in this zone can be
made:

Val-Arg-Asn-Asp-Arg-Asn-Lys-Lys-Lys-Lys-Glu-Thr-Ser-Lys-
Gln-Glu-Cys (peptide 1).

Peptide 1 corresponds to amino acids 151-167 and permits finding
Cys 167, which is present in the sequence and enables attachment
to a carrier (it will be noted that this peptide corresponds to a
consensus sequence of phosphorylation by kinase A).
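
To illustrate the kind of hydrophilicity scan referred to above, the sketch below averages a hydropathy index over a sliding seven-residue window across peptide 1. It assumes that the "Doolittle" ranking refers to the Kyte and Doolittle hydropathy scale and that a seven-residue window was used; the program and parameters actually used by the inventors are not stated, so the ranking it produces need not match the one quoted.

```python
# Kyte-Doolittle hydropathy index (more negative = more hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Peptide 1 (amino acids 151-167) in one-letter code.
PEPTIDE_1 = "VRNDRNKKKKETSKQEC"
FIRST_RESIDUE = 151


def window_means(seq: str, first: int, width: int = 7) -> dict:
    """Map the starting residue number of each window to its mean index."""
    return {
        first + i: sum(KYTE_DOOLITTLE[aa] for aa in seq[i:i + width]) / width
        for i in range(len(seq) - width + 1)
    }


if __name__ == "__main__":
    for start, mean in window_means(PEPTIDE_1, FIRST_RESIDUE).items():
        print(f"{start}-{start + 6}: {mean:.2f}")
```

Every window in peptide 1 has a negative mean on this scale, which is consistent with the description of the region as very hydrophilic.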

Peptide 1 can be shortened at the N-terminus while preserving the
beta-turn and at the C-terminus while replacing Ser by Cys to maintain
the possibility of coupling at this level:

Asn-Asp-Arg-Asn-Lys-Lys-Lys-Lys-Glu-Thr-Cys (peptide 2).

Peptide 2 is also favorable, but is clearly less favorable than
Peptide 1 from the viewpoint of hydrophilicity and because of its higher
potential for spatial organization (probably as an amphiphilic
helix).

Finally, it will be noted that the C-terminal end constitutes
a preferred region as a function of its mobility, but it
nevertheless remains very hydrophobic. For example, the following
peptide is contemplated:

Cys-Gly-Val-Ser-Gln-Ser-Pro-Leu-Val-Gln (peptide 3).

Peptide 3 can be fixed in a specific manner by an N-terminal Cys
in such a way as to reproduce its aspect on the protein.

The nucleotide sequences of hap gene encoding those peptides
are as follows:

For peptide 1:

GTCAGGAATGACAGGAACAAGAAAAAGAAGGAGACTTCGAAGCAAGAATGC.

For peptide 2:

AATGACAGGAACAAGAAAAAGAAGGAGACT.

For peptide 3:

GGGGTCAGTCAGTCACCACTCGTGCAA.

For peptide of amino acids 175-185:

GCTGAGTTGGACGATCTCACAGAGAAGATCCGA.
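
A coding sequence of this kind can be checked against its peptide by translating it with the standard genetic code. The sketch below is illustrative only (the function and constant names are hypothetical); it translates the peptide 1 coding sequence and the final 24 nucleotides of the cDNA coding region, printing one-letter amino acid codes with `*` for a stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna: str) -> str:
    """Translate a DNA coding strand (5' to 3') into one-letter codes."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Coding sequence given for peptide 1 (amino acids 151-167).
PEPTIDE_1_DNA = "GTCAGGAATGACAGGAACAAGAAAAAGAAGGAGACTTCGAAGCAAGAATGC"
# Last 24 nucleotides of the coding region, including the stop codon.
C_TERMINAL_DNA = "AGTCAGTCACCACTCGTGCAATAA"

if __name__ == "__main__":
    print(translate(PEPTIDE_1_DNA))
    print(translate(C_TERMINAL_DNA))
```

The first translation reads Val-Arg-Asn-Asp-Arg-Asn-Lys-Lys-Lys-Lys-Glu-Thr-Ser-Lys-Gln-Glu-Cys in three-letter code, matching peptide 1.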

The polypeptides of the invention can be injected in mice,
and monoclonal and polyclonal antibodies can be obtained.
Classical methods can be used for the preparation of hybridomas.
The antibodies can be used to quantify the amount of human
receptors produced by patients in order to correlate the pathological
states of illness and quantity of receptors or the
absence of such receptors.

Epitope-bearing polypeptides, particularly those whose
N-terminal and C-terminal amino acids are free, are accessible by 
chemical synthesis using techniques well known in the chemistry 
of proteins. For example, the synthesis of peptides in homoge- 
neous solution and in solid phase is well known. 

In this respect, recourse may be had to the solid phase synthesis
of peptides using the method of Merrifield, J. Am. Chem.
Soc. 85, 2149-2154 (1963), or the method of synthesis in homogeneous
solution described by Houben-Weyl in the work entitled
"Methoden der Organischen Chemie" (Methods of Organic Chemistry),
edited by E. Wünsch, vol. 15-I and II, Thieme, Stuttgart (1974).

This method of synthesis consists of successively condensing 
either successive amino acids in pairs in the appropriate
order, or successive peptide fragments previously available or 
formed and containing already several aminoacyl residues in the 
appropriate order, respectively. Except for the carboxyl and 
amino groups which will be engaged in the formation of the 
peptide bonds, care must be taken to protect beforehand all other 
reactive groups borne by these aminoacyl groups and fragments. 
However, prior to the formation of the peptide bonds, the car- 
boxyl groups are advantageously activated according to methods 
well known in the synthesis of peptides. Alternatively, recourse 
may be had to coupling reactions bringing into play conventional
coupling reagents, for instance of the carbodiimide type such as
1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide. When the amino
acid group carries an additional amino group (e.g. lysine) or 
another acid function (e.g. glutamic acid), these groups may be 
protected by carbobenzoxy or t-butyloxycarbonyl groups, as 
regards the amino groups, or by t-butylester groups, as regards 
the carboxylic groups. Similar procedures are available for the 
protection of other reactive groups. For example, an SH group (e.g.
in cysteine) can be protected by an acetamidomethyl or 
paramethoxybenzyl group. 

In the case of progressive synthesis, amino acid by amino 
acid, the synthesis preferably starts by the condensation of the 
C-terminal amino acid with the amino acid which corresponds to 
the neighboring aminoacyl group in the desired sequence and so 
on, step by step, up to the N-terminal amino acid. Another
preferred technique that can be relied upon is that described by
R.D. Merrifield in "Solid Phase Peptide Synthesis" (J. Am. Chem.
Soc. 85, 2149-2154). In accordance with the Merrifield process,
the first C-terminal amino acid of the chain is fixed to a suit- 
able porous polymeric resin by means of its carboxylic group, the 
amino group of said amino acid then being protected, for example, 
by a t-butyloxycarbonyl group. 

When the first C-terminal amino acid is thus fixed to the 
resin, the protective group of the amino group is removed by 
washing the resin with an acid, i.e. trifluoroacetic acid when
the protective group of the amino group is a t-butyloxycarbonyl
group.

Then the carboxylic group of the second amino acid, which is
to provide the second aminoacyl group of the desired peptide
sequence, is coupled to the deprotected amino group of the
C-terminal amino acid fixed to the resin. Preferably, the
carboxyl group of this second amino acid has been activated, for
example by dicyclohexylcarbodiimide, while its amino group has
been protected, for example by a t-butyloxycarbonyl group. The
first part of the desired peptide chain, which comprises the 
first two amino acids, is thus obtained. As previously, the 
amino group is then deprotected, and one can further proceed with 
the fixing of the next aminoacyl group and so forth until the 
whole peptide sought is obtained. 

The protective groups of the different side groups, if any, 
of the peptide chain so formed can then be removed. The peptide 
sought can then be detached from the resin, for example, by means 
of hydrofluoric acid, and finally recovered in pure form from the
acid solution according to conventional procedures. 
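
By way of illustration only, the order of operations in the
Merrifield process described above can be sketched in a few lines
of Python. The function name and the wording of each step are
hypothetical and form no part of the original disclosure; the
sketch merely records that the chain is assembled from the
C-terminal amino acid toward the N-terminal amino acid, using the
t-butyloxycarbonyl (Boc) protection, trifluoroacetic acid
deprotection and hydrofluoric acid cleavage given as examples in
the text.

```python
from typing import List


def merrifield_steps(peptide: str) -> List[str]:
    """List solid-phase steps for a peptide written N- to C-terminus.

    Hypothetical bookkeeping helper: it only orders the steps named in
    the text and does not model any chemistry.
    """
    if not peptide:
        raise ValueError("peptide must contain at least one residue")
    residues = list(peptide)
    # The C-terminal residue is fixed to the resin first.
    steps = [f"attach Boc-{residues[-1]} to resin via its carboxyl group"]
    # Remaining residues are added one by one toward the N-terminus.
    for residue in reversed(residues[:-1]):
        steps.append("deprotect amino group (trifluoroacetic acid)")
        steps.append(f"couple activated Boc-{residue}")
    steps.append("remove side-chain protecting groups")
    steps.append("cleave peptide from resin (hydrofluoric acid)")
    return steps


if __name__ == "__main__":
    for number, step in enumerate(merrifield_steps("GAK"), start=1):
        print(number, step)
```

For a tripeptide written Gly-Ala-Lys ("GAK"), the first step fixes
lysine to the resin and the two couplings add alanine and then
glycine.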

Depending on the use to be made of the proteins of the 
invention, it may be desirable to label the proteins. Examples 
of suitable labels are radioactive labels, enzymatic labels, 
fluorescent labels, chemiluminescent labels, or chromophores.
The methods for labeling proteins of the invention do not differ 
in essence from those widely used for labeling immunoglobulin. 

A.3. Tissue Specific mRNA Distribution

In order to study expression of the hap gene, Northern blot 
analysis was performed using MNT as a probe and poly(A)+ RNA 
extracted from various human tissues and cell lines. The results 
are shown in Figure 3. 

More particularly, Northern blot analyses were performed
with poly(A)+ RNAs (15 µg per lane) extracted from different
human organs and cell lines. A control hybridization with a mouse
beta-actin cDNA probe is shown below the hybridizations in Fig.
3. Hap mRNA in different tissues is shown in Fig. 3 as follows:

Lane a ovary
Lane b uterus
Lane c HBL 100 mammary cells
Lane d adult spleen
Lane e 18 weeks fetal spleen
Lane f K562 hematopoietic cell line
Lane g HL60 hematopoietic cell line
Lane h prostatic adenoma
Lane i kidney
Lane j adult liver
Lane k 18 weeks fetal liver.
Lanes a-k correspond to a one day exposure. 

Fig. 3 shows that two RNA species of 3 kb and approximately
2.5 kb (the size of this smaller mRNA is slightly variable from
one organ to another) were expressed at low abundance in ovary
(lane a), uterus (lane b), HBL 100 mammary cells (lane c), adult
and fetal spleen (lanes d and e, respectively), and K562 and HL60
hematopoietic cell lines (lanes f and g, respectively).
Surprisingly, an approximately tenfold higher level of expression
was detected in prostatic adenoma (lane h) and kidney (lane i).
By contrast, a single mRNA of 3000 nucleotides, expressed at low
levels, was present in poly(A)+ RNA from adult and fetal liver
tissues (lanes j and k). Therefore, the cloned hap cDNA is
likely to be a full-length copy of this transcript.

The finding of two mRNA species overexpressed in prostate 
and kidney, as well as the presence of a single mRNA expressed at 
low level in adult and fetal livers show that hap expression is 
differentially regulated in those organs. This tissue specific 
expression provides some indication that prostate and kidney, as 
well as liver, could be key tissues and that hap functions in 
those cell types may differ. 

Fig. 4 shows hap mRNA in HCC and HCC derived cell-lines as
follows:

Lane a, normal liver (four days autoradiography);

Lanes b, c, d: three HCC samples (Lane b, patient Ca; Lane
c, patient Mo; Lane d, patient TCI);

Lanes e, f, g: three HCC-derived cell lines (Lane e,
PLC/PRF/5; Lane f, HEPG2; Lane g, HEP3B).

The lanes b-g correspond to a one day exposure. Once again, a 
control hybridization with a normal beta-actin cDNA probe is 
shown below the hybridizations. 

With reference to Fig. 4, the smaller 2.5 kb mRNA was
undetectable, even after long exposure, in three adult and two
fetal human livers analyzed (Fig. 4, Lane a). This differential
expression in normal livers may suggest a distinct role of hap in
this particular tissue.

Northern blot analysis of human HCCs and hepatoma cell lines 
showed almost constant alterations in hap transcription. There 
are two possible alternatives to explain this result. The 
smaller mRNA species can be simply expressed as a consequence of
the cellular dedifferentiation. The tumorous liver cell, having
lost its differentiated characteristics, would behave as any
other cell type and thus express the same 2.5 kb mRNA as found in
non-hepatic cells. However, the inability to detect such a
smaller transcript in fetal livers does not seem to favor this
hypothesis. On the contrary, the presence of the smaller
transcript may have preceded the tumorigenesis events and would
rather reflect a preneoplastic state. The presence of an
inappropriately expressed hap protein, normally absent from normal
hepatocytes, may have directly participated in the hepatocellular
transformation. In this respect, the previous study reporting an
HBV integration in the hap gene of a human HCC (Dejean et al.,
1986) strongly supports the idea that hap could be causatively
involved in liver oncogenesis. Indeed, in this tumor, a chimeric
gene between the viral pre-S1 gene and hap may have resulted in
the over-expression of a truncated hap protein. At present, it
is not known whether the smaller transcript is the one found in
non-hepatic tissues.

A.4. Expression of hap in Hepatocellular Carcinoma

Hap was first identified in a human primary liver cancer. 
Encouraged by this finding, poly(A)+ RNAs from seven hepatomas and
hepatoma-derived cell lines were analyzed by Northern blotting.
Five of them contained integrated HBV DNA sequences. In addition
to the 3 kb long mRNA found in normal adult and fetal liver, an
additional RNA species of approximately 2.5 kb was observed, in
equal or even greater amount, in three out of four HCCs (Fig. 4,
Lanes b, c, d) and in the PLC/PRF/5, HEPG2 and HEP3B hepatoma
cell-lines (Lanes e, f, g). The size of the smaller transcript
was variable from sample to sample. In addition, the two
transcripts were strikingly overexpressed, at least tenfold, in
the PLC/PRF/5 cells.

To test the possibility that the inappropriate expression of
hap in those six tumors and tumorous cell-lines might be the
consequence of a genomic DNA alteration, Southern blotting of
cellular DNA was performed using, as two probes, the MNT fragment
together with a 1 kb EcoRI fragment corresponding to the 5'
extremity of the cDNA insert (Fig. 2). No rearrangement and/or
amplification was detected with either of these two probes, each
of which detects a different single exon (data not shown),
suggesting that the hap gene was not altered at the genomic
level. It is not yet known whether the approximately 2.5 kb
mRNA, present in the liver tumorous samples and cell lines,
corresponds to the same smaller transcript as that found in
non-hepatic tissues. However, its presence in the liver seems to
be clearly associated with the hepatocellular transformed state.

A.5. Hormone-binding Assay

Amino-acid homologies between the hap protein and the
c-erbA/steroid receptors support the hypothesis that hap may be a
receptor for a thyroid/steroid hormone-related ligand. The
ability to express functional receptors in vitro from cloned
c-erbA/steroid receptor genes led to the use of an in vitro
translation assay to identify a putative hap ligand.

The coding region of hap was cloned into pTZ18 plasmid
vector to allow in vitro transcription with the T7 RNA polymerase
and subsequent translation in reticulocyte lysates. The results
are shown in Fig. 5. More particularly, 35S-methionine-labelled
products synthesized using T7 polymerase-catalysed RNA
transcripts were separated on a 12% SDS-polyacrylamide gel, which
was fluorographed (DMSO-PPO). The lanes in Fig. 5 are as follows:

Lane a, pCOD 20 (sense RNA, 70 ng) 

Lane b, pCOD 20 (140 ng) 

Lane c, pCOD 14 (antisense RNA, 140 ng).

Figure 5 shows that the hap RNA directed the efficient
synthesis of a major protein, with a 51 K relative molecular mass,
consistent with the size predicted by the amino acid sequence
(lanes a and b), whereas the anti-sense RNA-programmed lysate
gave negligible incorporation (lane c).

Because c-erbA and hap colocalize on chromosome 3 and are
more closely related according to their amino acid sequence,
(125I)-T3 (triiodothyronine), -reverse T3 (3,3',5'-triiodothyronine)
and -T4 (thyroxine) were first tested for their binding with the
in vitro translated hap polypeptide. No specific fixation with
any of those three thyroid hormones could be detected. As a
positive control, binding of T3 was detected with nuclear
extracts from HeLa cells. The results were negative as well when
the experiment was repeated with (3H)-retinol, -retinoic acid,
and -testosterone, which represent three putative ligands for hap
whose receptors have not yet been cloned. Although it cannot be
excluded that hap may encode a hormone-independent
transcriptional activator, it is more likely that the hap
product, i.e. the hap protein, is a receptor for a presently
unidentified hormone.

A.6. Similarity of HAP Protein to Thyroid/Steroid Hormone Receptors

The c-erbA gene product, recently identified as a receptor
for thyroid hormone (Weinberger et al., 1986; Sap et al., 1986),
as well as the steroid receptors, belongs to a superfamily of
regulatory proteins which, upon binding their specific ligand,
appear capable of activating the transcription of target genes
(reviewed by Yamamoto, 1985). This activation seems to be the
result of a specific binding of the hormone-receptor complex to
high-affinity sites on chromatin.

Comparative sequence analysis has been made between the
following different cloned steroid receptors:

glucocorticoid receptor (GR) (Hollenberg et al., 1985;
Miesfeld et al., 1986);

oestrogen receptor (ER) (Green et al., 1986; Greene et al.,
1986);

progesterone receptor (PR) (Conneely et al., 1986; Loosfelt
et al., 1986); and

thyroid hormone receptor (c-erbA product) (Weinberger et
al., 1986; Sap et al., 1986).

Mutation analysis has also been carried out (Kumar et al.,
1986; Hollenberg et al., 1987; Miesfeld et al., 1987). The
results revealed the presence of two conserved regions
representing the putative DNA-binding and hormone-binding domains
of those molecules. It has now been discovered that hap protein
is homologous to the thyroid/steroid hormone receptors.

More particularly, homology previously reported between the
putative 147 bp cellular exon (bracketed in Fig. 2) and the
c-erbA/steroid receptor genes led us to compare the entire hap
predicted amino acid sequence with hGR, rPR, hER, and
hc-erbA/thyroid hormone receptor. The five sequences have been
aligned for maximal homology by the introduction of gaps. The
results are depicted in Fig. 6. Specifically, the following
amino acid sequences were aligned after a computer alignment of
pairs (Wilbur and Lipman, 1983):

hap product,

human placenta c-erbA protein (hc-erbA, Weinberger et al.,
1986),

human oestrogen receptor (hER, Green et al., 1986),

rabbit progesterone receptor (rPR, Loosfelt et al., 1986),
and

human glucocorticoid receptor (hGR, Hollenberg et al.,
1985).

A minimal number of gaps (-) was introduced in the alignment.

Amino acid residues matched in at least three of the poly- 
peptides are boxed in Figure 6. The codes for amino acids are: 



A    Ala    Alanine
C    Cys    Cysteine
D    Asp    Aspartic Acid
E    Glu    Glutamic Acid
F    Phe    Phenylalanine
G    Gly    Glycine
H    His    Histidine
I    Ile    Isoleucine
K    Lys    Lysine
L    Leu    Leucine
M    Met    Methionine
N    Asn    Asparagine
P    Pro    Proline
Q    Gln    Glutamine
R    Arg    Arginine
S    Ser    Serine
T    Thr    Threonine
V    Val    Valine
W    Trp    Tryptophan
Y    Tyr    Tyrosine.

The sequence comparison analysis revealed that the two
regions highly conserved in the thyroid/steroid hormone receptors
are similarly conserved in the hap product. Consequently, the
overall organization of hap is very similar to that of the four
receptors in that it can be roughly divided into four regions
(arbitrarily referred to as A/B, C, D and E (Krust et al.,
1986)).

In C, the most highly conserved region, extending from
amino acid 81 to 146 in hap, the nine cysteines already conserved
between the four known receptors are strikingly present at the
same positions. Comparison of the cysteine-rich region of
hap with the corresponding region of the four receptors reveals
64% amino acid identity with hc-erbA, 59% with hER, 42% with rPR
and 44% with hGR. This is schematically represented in Fig. 7.
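
For the reader's convenience, a percentage of amino acid identity
of the kind quoted above can be computed from two gapped, aligned
sequences as sketched below in Python. The function name is
hypothetical, and the convention of ignoring columns that contain
a gap is an assumption; the text does not state how gaps were
counted. The sequences in the usage note are toy strings, not hap
or receptor sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over the gap-free aligned columns.

    Assumption: columns where either sequence carries a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # gapped column, not counted
        compared += 1
        if residue_a == residue_b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy alignment: four identical residues out of five gap-free columns.
    print(percent_identity("ACDE-G", "ACDQWG"))
```

With the toy strings shown, five columns are compared and four
match. Other conventions, such as dividing by the full alignment
length, would give different figures.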

Referring to Fig. 7, a schematic alignment of the five pro- 
teins can be seen. The division of the thyroid/steroid hormone 
receptor regions A/B, C, D, E is schematically represented in the 
hap protein. The two highly conserved regions, identified as the 
putative DNA-binding (region C) and hormone-binding (region E) 
domains of the receptors, are shown as stippled blocks. The num- 
bers refer to the position of amino acid residues. The sequences 
of each of the hc-erbA product, hER, rPR and hGR receptors are 
compared with the hap protein. The numbers present in the stip- 
pled blocks correspond to the percentage of homology between hap 
protein on the one hand and each of the receptors on the other 
hand in the two highly conserved regions C and E. The empty
blocks correspond to the non-conserved A/B and D regions. 

It has also been found that hap shares 47% homology in the C
region with the chicken vitamin D3 receptor (VDR), recently
cloned as a partial cDNA (McDonnell et al., 1987) (data not
shown). Apart from c-erbA, which contains two additional
residues, the 66 amino acid long C region shows a constant length
in hER, VDR, hGR, rPR and hap sequences.

Region E (residues 195-448), which is well-conserved, but to
a lesser extent, shows a slightly stronger homology to hc-erbA
(38%) (Fig. 7). The hap/hc-erbA homology, however, remains
inferior to the identity found between hGR and rPR (90 and 51 per
cent in regions C and E, respectively). No significant homology
was observed when comparing the A/B (residues 1-80) and D
(residues 147-194) regions which are similarly variable, both in
sequence and length, in the four known receptors.

It is thus evident from Figs. 6 and 7 that the hap product
exhibits two highly homologous regions. The C domain is
characterized by strikingly conserved Cys-X2-Cys units, evoking
those found in the DNA-binding transcriptional factor TFIIIA
(Miller et al., 1985) and in some proteins that regulate
development, such as Kruppel (Rosenberg et al., 1986). In the
latter, the Cys-X2-Cys, together with His-X3-His units, can form
metal binding fingers that are crucial for DNA-binding (Berg,
1986; Diakun et al., 1986). Similarly, the C domains of
previously cloned receptors are likely to contain metal binding
fingers and were shown to bind DNA (Hollenberg et al., 1987;
Miesfeld et al., 1987). Since the C region of the hap gene
product shares 24/66 conserved amino acids with all steroid or
thyroid hormone receptors, including all nine cysteine residues,
it is likely that the hap protein is a DNA-binding protein. Hap,
like the c-erbA/steroid receptors, may modulate the transcription
of target genes.
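
As an illustration of how Cys-X2-Cys units of the kind discussed
above can be located in a one-letter amino acid sequence, a short
Python sketch follows. The function name is hypothetical, X is
taken to mean any residue, and the sequence in the usage note is
an invented toy string rather than the hap sequence.

```python
import re
from typing import List, Tuple

# A lookahead is used so that overlapping units sharing a cysteine
# are all reported.
CYS_X2_CYS = re.compile(r"(?=(C[A-Z]{2}C))")


def find_cys_x2_cys(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based position, unit) for every Cys-X2-Cys unit found."""
    return [
        (match.start() + 1, match.group(1))
        for match in CYS_X2_CYS.finditer(sequence.upper())
    ]


if __name__ == "__main__":
    # Invented toy sequence, used only to exercise the pattern.
    for position, unit in find_cys_x2_cys("MACAACGGCPPCK"):
        print(position, unit)
```

In the toy string the units start at positions 3, 6 and 9, each
sharing a cysteine with its neighbor. The His-X3-His units
mentioned in the text could be sought in the same way with the
pattern H[A-Z]{3}H.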

In addition, the significant homology detected in the E
domain suggests that the hap product is a ligand-binding protein
and raises the question of the nature of the putative ligand.
Hap protein seems to differ too much from previously cloned
hormone receptors to be a variant of one of them. In addition,
the in vitro translated 51 K hap polypeptide failed to bind all
ligands tested. Although the hap gene product could be a
ligand-independent DNA-binding protein, it is believed that hap
encodes a receptor for a presently unidentified circulating or
intracellular ligand.

It has been proposed that steroid and thyroid hormone 
receptor genes were derived from a common ancestor (Green and 
Chambon, 1986). This primordial gene may have provided to the 
receptors their common scaffolding while the hormone and target 
gene cellular DNA specificities were acquired through mutations 
accumulated in the C and E domains. Hap is linked both to the
steroid receptor genes by its shorter C domain (66 AA) and to the
thyroid hormone receptor genes by its clearly greater homology
with c-erbA in the E region (38%). This suggests that hap ligand
may belong to a different hormone family. 

Different functions have been assigned to the four regions
defined in the glucocorticoid and oestrogen receptors (Kumar et
al., 1986; Giguere et al., 1986; Miesfeld et al., 1987). By
analogy, the regions C and E may represent, respectively, the
putative DNA-binding and hormone-binding domains of the hap
protein. The precise functions of the A/B and D domains remain
unknown. The presence of the amino-terminal A/B region of the
human GR has been recently shown to be necessary for full
transcriptional activity (Hollenberg et al., 1987), whereas
results obtained with the rat GR indicated it was dispensable
(Miesfeld et al., 1987). From this alignment study it appears
that hap is distinct from, but closely related to, the
thyroid/steroid hormone receptor genes, suggesting that its
product may be a novel ligand-dependent, DNA-binding protein.

A.7. Hap related genes

Southern blotting was performed on restriction
enzyme-digested DNAs obtained from different organisms with
labelled genomic MNT fragment containing the first exon of the
cysteine-rich region of hap. The results are shown in Fig. 8.
More particularly, hap related genes in vertebrates (A) and in
humans (B and C) were compared. Cellular DNA (20 µg) from
various sources was digested with BglII and subjected to Southern
blot analysis using the MNT probe under non-stringent
hybridization and washing conditions. The lanes in Fig. 8A are
identified as follows:

Lane a human liver 

Lane b domestic dog liver

Lane c woodchuck (Marmota monax)

Lane d mouse liver (BALB/c strain) 

Lane e chicken erythrocytes 

Lane f cartilaginous fish (Torpedo). 

As illustrated in Fig. 8A, BglII fragments that anneal
effectively with MNT probe under non-stringent hybridization and
washing conditions are present in digests of DNA from several
mammals (mouse, woodchuck, dog) as well as from bird and fish.
If this blotting experiment is performed at high stringency, no
hybridization is observed with heterologous DNA (data not shown).
These data suggest that the hybridizing sequences represent
evolutionarily conserved homologs of hap.

The existence of multiple c-erbA and GR genes (Jansson et
al., 1983; Weinberger et al., 1986; Hollenberg et al., 1985)
encouraged a search for hap related genes in the human genome.
Thus, human liver DNA digested by PstI, BamHI, and EcoRI was
analyzed by Southern blot, using the MNT probe, under stringent
conditions. The results are shown in Fig. 8B. After digestion of
liver DNA by PstI (lane a), BamHI (lane b), or EcoRI (lane c), a
single band is observed with the MNT probe in high stringency
hybridization.

The same blot was hybridized with the MNT probe under
non-stringent hybridization and washing conditions. The results
are shown in Fig. 8C. When Southern blotting was performed under
relaxed hybridization conditions, additional bands were observed
in the products of each enzyme digestion (Fig. 8C, lanes a, b, c).
For example, seven faint hybridizing fragments of 1, 1.7, 2.4,
3.8, 5.5, 6, 7.4 kb were observed in the BamHI digestion (lane
b). None of those bands cross-hybridized with a human c-erbA
probe (data not shown). A minimum of three faint bands in the
PstI lane suggests the existence of at least four related hap
genes in the human genome.

From a panel of somatic cell hybrids, hap was assigned to
chromosome 3 (Dejean et al., 1986). To find out whether the hap
related genes were all chromosomally linked or not, DNAs from
human liver, LA.56U and 53K cell-lines (two mouse/human somatic
cell hybrids containing, altogether, most human chromosomes
except chromosome 3 (Nguyen Van Cong et al., 1986)), and mouse
lymphoid cells were BamHI digested, transferred to
nitrocellulose, and hybridized to the MNT probe in low-stringency
conditions. Of the seven faint bands present in the human liver
DNA track, two at least were conserved in the LA.56U and/or L.53K
cell line DNA digestions (data not shown), indicating that some
of the hap genes do not localize on chromosome 3. Altogether the
results suggest that hap belongs to a multigene family consisting
of at least four members dispersed in the human genome.

The experimental procedures used in carrying out this
invention will now be described in greater detail.

A.8. EXPERIMENTAL PROCEDURES

A.8.1. cDNA Cloning and Screening

Briefly, the cDNA was synthesized using oligo dT primed
poly-A+ liver mRNA, using the method of Gubler and Hoffman (1983)
(C. de Taisne, unpublished data). cDNAs were size selected on a
sucrose gradient and the fraction corresponding to a mean size of
3 kb was treated with EcoRI methylase. After addition of EcoRI
linkers, the cDNA was digested by EcoRI and ligated to an
EcoRI-restricted lambda-NM1149. After in vitro encapsidation, the
phages were amplified on C600 hfl and 2 x 10^6 recombinants were
plated at a density of 10,000 per dish. The dishes were
transferred to nylon filters and hybridized to the 350 bp
EcoRI-EcoRI genomic fragment (MNT) previously described (Dejean
et al., 1986). Four positive clones were isolated and the
restriction map of each insert was determined. The longest one,
clone lambda-13, was subjected to nucleotide sequence analysis.

A.8.2. Nucleotide Sequence

Clone lambda-13 DNA was sonicated, treated with the Klenow
fragment of DNA polymerase plus deoxyribonucleotides (2 h, 15°C)
and fractionated by agarose gel electrophoresis. Fragments of
400-700 bp were excised and electroeluted. DNA was
ethanol-precipitated, ligated to dephosphorylated SmaI-cleaved
M13 mp8 replicative form DNA and transfected into Escherichia
coli strain TG-1 by the high-efficiency technique of Hanahan
(1983). Recombinant clones were detected by plaque hybridization
using any of the four EcoRI fragments of cDNA insert as probes
(Fig. 1). Single-stranded templates were prepared from plaques
exhibiting positive hybridization signals and were sequenced by
the dideoxy chain termination procedure (Sanger et al., 1977)
using buffer gradient gels (Biggin et al., 1983).



A.8.3. Northern Blot

Cytoplasmic RNA was isolated from the fresh tissue using
guanidine thiocyanate, and the cell line RNA was extracted using
isotonic buffer and 0.5% SDS, 10 mM Na acetate pH 5.2. RNAs were
then treated with hot phenol. Poly(A)+ RNAs (15 µg) of the
different samples were separated on a 1% agarose gel containing
glyoxal, transferred to nylon filters and probed using the
nick-translated MNT fragment. The experimental procedure is
described in Maniatis et al. (1982).



A.8.4. Southern Blot

20 µg of genomic DNA was digested to completion,
fractionated on a 0.8% agarose gel and transferred to nylon paper.
Low stringency hybridization was performed as follows: 24 h
prehybridization in 35% formamide, 5x Denhardt, 5x SSC, 300
µg/ml denatured salmon sperm DNA, at 40°C; 48 h hybridization
with 35% formamide, 5x Denhardt, 5x SSC, 10% Dextran sulfate,
2 x 10^6 cpm/ml denatured 32P labelled DNA probe (specific
activity 5 x 10^8 cpm/µg). Washes were made in 2x SSC, 0.1% SDS,
55°C for 15 min. High stringency hybridization conditions were
the same except that 50% formamide was used with 24 h
hybridization. Washing was in 0.1x SSC, 0.1% SDS, 55°C for 30
min.
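
The effect of formamide and salt on stringency can be illustrated
with one commonly quoted empirical approximation for the melting
temperature of a DNA duplex. The Python sketch below is not part
of the procedure described above: the formula and its
coefficients, the 50% G+C content, and the sodium content taken
for 1x SSC (0.15 M NaCl plus 0.015 M trisodium citrate) are all
assumptions made for illustration, and the approximation is only
a rough guide at the high salt of 5x SSC. Only the formamide
percentages, SSC strengths and the 350 bp probe length are taken
from the text.

```python
import math

# Assumed sodium content of 1x SSC: 0.15 M NaCl + 0.015 M trisodium citrate.
NA_MOLAR_PER_1X_SSC = 0.195


def estimated_tm_celsius(ssc_x: float, formamide_percent: float,
                         gc_percent: float, length_bp: int) -> float:
    """Rough duplex melting temperature from an empirical approximation.

    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    if ssc_x <= 0 or length_bp <= 0:
        raise ValueError("ssc_x and length_bp must be positive")
    sodium_molar = ssc_x * NA_MOLAR_PER_1X_SSC
    return (81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * gc_percent
            - 0.61 * formamide_percent - 500.0 / length_bp)


if __name__ == "__main__":
    gc = 50.0  # hypothetical G+C content, not stated in the text
    low = estimated_tm_celsius(5.0, 35.0, gc, 350)
    high = estimated_tm_celsius(5.0, 50.0, gc, 350)
    print(f"hybridization, 35% formamide: Tm about {low:.1f} C")
    print(f"hybridization, 50% formamide: Tm about {high:.1f} C")
    print(f"wash 2x SSC:   Tm about {estimated_tm_celsius(2.0, 0.0, gc, 350):.1f} C")
    print(f"wash 0.1x SSC: Tm about {estimated_tm_celsius(0.1, 0.0, gc, 350):.1f} C")
```

Under these assumptions, raising formamide from 35% to 50% lowers
the estimated melting temperature, and a 0.1x SSC wash gives a
lower melting temperature than a 2x SSC wash at the same
temperature, so that only better matched hybrids remain annealed.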

A.8.5. Construction of Plasmids for In-Vitro Translation

The 3 kb insert of phage lambda-13 was excised from the
phage DNA by partial EcoRI digestion, electroeluted and digested
by BamHI and HindIII. To remove most of the untranslated
sequences, the 1.8 kb cDNA fragment obtained was then partially
digested by MaeI (Boehringer). The 1.4 kb MaeI-MaeI fragment,
extending from the first to the third MaeI site in the cDNA
insert sequence (Fig. 1) and containing the complete coding
region, was mixed with SmaI-cut dephosphorylated pTZ18
(Pharmacia), the extremities were filled in using Klenow fragment
of DNA Pol I (Amersham) and ligated. Two plasmids were derived:
pCOD20 (sense) and pCOD14 (antisense).

A.8.6. Translation and hormone binding assays

pCOD20 and pCOD14 were linearized with HindIII. Capped mRNA
was generated using 5 µg of DNA, 5 µM rNTP, 25 mM DTT, 100 U
RNAsin (Promega), 50 U T7 Pol (Genofit) in 40 mM Tris pH 8, 8 mM
MgCl2, 2 mM spermidine, 50 mM NaCl, in 100 µl at 37°C. Capping
was performed by omitting GTP and adding CAP (m7G(5')ppp(5')G)
(Pharmacia) for the first 15 minutes of the reaction.
Translation was performed using rabbit reticulocyte lysate
(Amersham) under the suggested conditions using 40 µl of lysate
for 2.5 µg of capped RNA.

The thyroid hormone binding assays included 5 µl of lysate
in (0.25 M sucrose, 0.25 M KCl, 20 mM Tris (pH 7.5), 2 mM MgCl2,
2 mM EDTA, 5 mM DTT) with 1 mM T4, T3 or rT3 (specific activity:
T4, rT3 1400 mCi/mg Amersham, T3 3000 mCi/mg NEN). After at
least 2 h of incubation at 0°C, free was separated from bound by
filtration through millipore HAWP 02500 filters using 10 ml of
ice cold buffer. For testosterone, retinol, retinoic acid, 10 µl
of lysate were added to 45 µl of 20 mM Tris pH 7.3, 1 mM EDTA,
50 mM NaCl, 2 mM beta-mercaptoethanol and 5 mM testosterone, 400
mM retinol or 15 mM retinoic acid (81 Ci/mmol; 60 Ci/mmol; 46
Ci/mmol; Amersham). After an overnight incubation at 0°C, free
was separated from bound by Dextran coated charcoal (0.5% Norit
A - 0.05% T70) and centrifugation. All experiments were
performed in duplicate and parallel experiments were performed
with a 100-fold excess of the corresponding cold hormone.
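
The purpose of the parallel incubations with excess cold hormone
can be illustrated by the usual way such duplicate counts are
reduced to a specific binding value. The Python sketch below is an
assumption about the data reduction, which the text does not
describe; the function name is hypothetical and the counts in the
usage note are invented numbers, not experimental results.

```python
from statistics import mean
from typing import Sequence


def specific_binding(total_cpm: Sequence[float],
                     nonspecific_cpm: Sequence[float]) -> float:
    """Mean bound counts minus mean counts bound with excess cold hormone."""
    if not total_cpm or not nonspecific_cpm:
        raise ValueError("both replicate lists must be non-empty")
    # Counts that persist with excess unlabelled hormone are treated
    # as non-specific and subtracted from the total.
    return mean(total_cpm) - mean(nonspecific_cpm)


if __name__ == "__main__":
    # Invented duplicate counts (cpm), for illustration only.
    print(specific_binding([5000, 5200], [400, 600]))
```

A value near zero, as would be obtained when the counts with and
without cold hormone are similar, corresponds to the absence of
specific fixation reported for the hap polypeptide.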

B. DIFFERENTIAL EXPRESSION AND LIGAND REGULATION OF THE
RETINOIC ACID RECEPTOR α AND β GENES

The recent cDNA cloning of several nuclear hormone 
receptors, including the steroid and thyroid hormone receptors, 
has revealed that their overall structures were strikingly simi- 
lar. In particular, two highly conserved regions have been shown 
to correspond to the DNA- and hormone-binding domains (for review 
see Evans, 1988).

Analysis of a hepatitis B virus integration site in a human
hepatocellular carcinoma led to the identification of a putative
genomic exon highly homologous to the DNA-binding domain of other
members of this nuclear receptor multigene family (Dejean et al.,
1986). Two different cDNAs homologous to this sequence have
recently been cloned (Giguere et al., 1987; Petkovich et al.,
1987; de The et al., 1987) and their translation products
identified as retinoic acid receptors (designated RAR α and RAR
β) (Giguere et al., 1987; Petkovich et al., 1987; Brand et al.,
1988). The two receptors have almost identical DNA- and
hormone-binding domains but differ in their N-terminal part.
Their respective genes map to different chromosomes, 17q21.1 for
RAR α (Mattei et al., 1988) and 3p24 for RAR β (Mattei et al.,
1988), and their nucleotide sequences are only distantly related.
Both genes are found in most species (Brand et al., 1988 and de
The, unpublished results), suggesting an early gene duplication.
Analysis of the RA-dependent gene transactivation also showed
that the ED50 of RAR α and β were significantly different (10^-8
and 10^-9 M, respectively), indicating that RAR β may mediate
activation of transcription at RA concentrations 10-fold lower
than those necessary for activation by RAR α (Brand et al.,
1988).
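
The practical meaning of a 10-fold difference in ED50 can be
illustrated with a simple saturation curve. The Python sketch
below assumes a hyperbolic dose-response relation, which is a
modelling assumption and is not taken from the cited study; the
function name is hypothetical and the two ED50 values are
placeholders chosen only to differ 10-fold.

```python
def fractional_activation(ra_molar: float, ed50_molar: float) -> float:
    """Fraction of maximal response for a simple hyperbolic dose-response."""
    if ra_molar < 0 or ed50_molar <= 0:
        raise ValueError("concentration must be >= 0 and ED50 must be > 0")
    return ra_molar / (ra_molar + ed50_molar)


if __name__ == "__main__":
    ed50_high = 1e-8  # placeholder ED50 (M) for the less sensitive receptor
    ed50_low = 1e-9   # placeholder ED50 (M), 10-fold lower
    for ra in (1e-10, 1e-9, 1e-8, 1e-7):
        print(f"RA {ra:.0e} M: "
              f"{fractional_activation(ra, ed50_low):.2f} vs "
              f"{fractional_activation(ra, ed50_high):.2f}")
```

Under this assumption, at an RA concentration equal to the lower
ED50 the more sensitive receptor is half-maximally active while
the other reaches only about one eleventh of its maximum.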

The existence of two different retinoic acid receptors 
raises a number of questions as to the biological consequences of 
the RAR gene duplication. In particular, differences in the 
mechanisms of regulation or spatial expression patterns of the 
two receptors could account for distinct physiological roles. 
The tissue distribution of the transcripts for RAR α and β and
their response to RA have been studied. The results show clear
differences in the spatial patterns of expression and indicate
that the β, but not the α, RAR gene is transcriptionally
upregulated by RA in a protein synthesis-independent fashion.
The discovery of differential expression of the RAR α and β
genes, coupled with a selective regulation of RAR β gene
expression by RA, may prove to be important components of
retinoic acid
physiology. These findings strongly suggest that the two 
receptors are differentially involved in the various biological 
effects of RA. The results obtained in the study are summarized 
below. 

The RAR α gene, which is transcribed as two mRNA species of
3.2 and 2.3 kb, is overexpressed in the haematopoietic cell-lines
and has an otherwise low level of expression in all the other
human tissues examined. By contrast, the RAR β gene exhibits a
much more varied expression pattern. Indeed, the two transcripts,
3 and 2.5 kb, show large variations in their levels of expression
which range from undetectable (haematopoietic cell-lines) to
relatively abundant (kidney, cerebral cortex, etc.). Run-on
studies with the hepatoma cell-lines show that, at least in some
tissues, these differences may be due to an increase in the
transcription rate of the RAR β gene. These findings point to
complex regulatory mechanisms of RAR gene expression that may
confer on the cells various sensitivities to RA.

The availability of cloned RAR cDNAs prompted an investigation
of possible regulation of these receptor mRNAs by RA. Exposure
of hepatoma cells to RA led to a rapid increase in the level
of RAR β transcripts, while the abundance of RAR α transcripts
remained unaffected. The stimulation of expression of RAR β
mRNAs was induced by physiological concentrations of RA in a
dose-dependent manner. Such autoregulation is a general feature
of hormonal systems and has been shown to take place at the mRNA
and protein levels in the case of the nuclear receptors for
glucocorticoids (down-regulation, Okret et al., 1986) or vitamin
D3 (up-regulation, McDonnell et al., 1987). The RA-induced
upregulation of the RAR β transcripts was observed in the
presence of protein synthesis inhibitors. In vitro nuclear
transcript run-on assays show that the RA-induced increase in
RAR β mRNA levels is the consequence of enhanced transcription.
These findings demonstrate that the RAR β gene is
transcriptionally upregulated by RA and provide the first
identification of a primary target gene for RA. The cloning of the
promoter sequences of the RAR β gene should allow the
identification of the upstream genomic elements implicated in RA
responsiveness. These sequences will provide a useful tool
to determine whether the α and/or the β receptor is involved
in regulating β RAR gene expression.

The haematopoietic cell-line HL60 has been widely used as a
model for RA-induced differentiation (Strickland and Mahdavi,
1978). The data from this invention suggest that in this system
RAR α must be responsible for the RA-induced differentiated
phenotype, since HL60 does not appear to have any RAR β mRNAs.
Note in this respect that Davies et al. (1985), studying the
RA-dependent transglutaminase expression in these cells, have
found an ED50 of [illegible] consistent with an RAR α-mediated
transactivation.

The upregulation of the β receptor gene by RA may have very
important implications in developmental biology. Morphogen
gradients are frequently implicated in cell commitment (Slack,
1987). One example of this phenomenon is the polarization of the
chick limb bud where RA, the suspected morphogen, forms a
concentration gradient across the anterior-posterior axis of the
developing bud (Thaller and Eichele, 1987). However, the small
magnitude of this gradient (2.5 fold) is puzzling and suggests
the existence of amplification mechanisms (Robertson, 1987).
Since transactivation of target genes is dependent upon both
receptor and ligand concentrations, a small increase in RA may
result in a disproportionately larger RAR β effect. The effect
of this RA gradient could be potentiated by a corresponding
gradient in RAR β receptors as a consequence of upregulation by RA
itself.
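
The amplification argument can be made concrete with a toy
calculation. The sketch below is not a model taken from this
disclosure: it assumes, purely for illustration, that
transactivation is proportional to receptor level times fractional
ligand occupancy, and that under autoregulation the receptor level
is itself proportional to occupancy. The dissociation constant and
RA concentrations are hypothetical values chosen so that occupancy
is far from saturation.

```python
def occupancy(ra, kd):
    """Fractional ligand occupancy of the receptor (one-site binding)."""
    return ra / (ra + kd)


def output_fixed_receptor(ra, kd, receptor=1.0):
    """Transactivation when the receptor level does not depend on RA."""
    return receptor * occupancy(ra, kd)


def output_with_autoregulation(ra, kd):
    """Transactivation when the receptor level is itself induced by RA.

    Assumption: steady-state receptor level is proportional to occupancy.
    """
    receptor = occupancy(ra, kd)
    return receptor * occupancy(ra, kd)


KD = 1e-9              # hypothetical dissociation constant (mol/l)
LOW_RA = 1e-12         # hypothetical RA level at the low end of the gradient
HIGH_RA = 2.5 * LOW_RA # 2.5-fold gradient, as cited in the text

ratio_fixed = output_fixed_receptor(HIGH_RA, KD) / output_fixed_receptor(LOW_RA, KD)
ratio_auto = output_with_autoregulation(HIGH_RA, KD) / output_with_autoregulation(LOW_RA, KD)
print(f"fixed receptor: {ratio_fixed:.2f}-fold, autoregulated: {ratio_auto:.2f}-fold")
```

In this toy model the output ratio with autoregulation is the square
of the ratio without it, so a 2.5-fold ligand gradient yields roughly
a 6-fold difference in response when occupancy is low.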

B.1. Tissue distribution of the α and β RAR mRNAs.

To study the differential expression of the RAR α and β
genes, Northern blot analysis was performed using 5 µg
(micrograms) of poly(A)+ RNA extracted from various human tissues
and cell-lines. A RAR β clone previously identified (de The et
al., 1987) was used to isolate a partial cDNA clone for RAR α
from a hepatoma cell-line cDNA library, and the two cDNA inserts
were used as probes. More particularly, poly(A)+ mRNA (5 µg)
from different human tissues and cell-lines was denatured by
glyoxal, separated on a 1.2% agarose gel, blotted onto nylon
filters and hybridized to an α (Fig. 9, upper panel), then a β
(Fig. 9, middle panel) RAR cDNA single-stranded probe (see
materials and methods, infra). Exposure time was [illegible]. The
filters were subsequently hybridized to a β actin probe (Fig. 9,
lower panel) to ensure that equal amounts of RNA were present in
the different lanes. The following abbreviations are used in
Fig. 9. Sp. cord: spinal cord. C. cortex: cerebral cortex. K562
and HL60 are two haematopoietic cell-lines. PLC/PRF/5 is a
hepatoma-derived cell-line.

Referring to Fig. 9, the spatial distribution patterns were
clearly distinct between the two receptors. The RAR α probe
hybridized to two transcripts of 3.2 and 2.3 kilobases (kb) with
approximately equal intensity. The two mRNAs were present at low
levels in all tissues examined but were overexpressed in the
haematopoietic cell-lines, K562 and HL60.

When the same filters were hybridized with the RAR β probe,
a much more variable transcription pattern was observed (Figure
9). Two mRNA species of 3 kb and 2.5 kb were visible in most
tissues, except in the spinal cord and the liver (adult or fetal)
where the smaller transcript was undetectable. Major quantitative
differences in the level of expression of the two transcripts
were noted. The tissues examined could be classified
into four groups with respect to expression of β receptor mRNAs:
high (kidney, prostate, spinal cord, cerebral cortex, PLC/PRF/5
cells), average (liver, spleen, uterus, ovary), low (breast,
testis) and undetectable (K562 and HL60 cells). The use of a β
probe that did not hybridize to α allowed us to correct our
previous description of β RAR transcripts in these haematopoietic
cell-lines (de The et al., 1987). The suppression of β receptor
gene expression, associated with an overexpression of RAR α mRNAs,
seems to be a general feature of haematopoietic cell-lines, since
similar results were obtained when we repeated the study using
six other cell-lines (HEL, LAMA, U937, KG1, CCRF, Burkitt) (data
not shown).

B.2. RA-induced mRNA regulation.

To investigate whether retinoic acid modulates the expression
of its own receptor, PLC/PRF/5 cells were grown in the presence
of various concentrations of RA for different times, and RAR
α and β mRNAs were analysed by Northern blot hybridization. More
particularly, semi-confluent cells were grown for 6 hr in
charcoal-stripped medium and retinoic acid was then added to the
medium at various concentrations (10^-10 M to 10^-6 M) for 4 hr.
Control cells were treated with ethanol (E). Northern-blotting
was performed as described in connection with Figure 9, except
that 30 µg of total RNA was used. Dose-response is shown in Fig.
10A.

Another analysis was performed as in Fig. 10A, except that
10^-6 M RA was used for various times (0 to 12 h). Time-response
is shown in Fig. 10B. Exposure time was 12 hr for the β probe
(Fig. 10B, lower panel) and four days for the α probe (Fig. 10B,
upper panel).

When the cells were treated with a high concentration of RA
(10^-6 M), a rapid increase in β receptor mRNAs was observed, and
a dose-response analysis showed that this stimulatory effect was
already evident at an RA concentration of 10^-9 M (Fig. 10A, lower
panel). From densitometry, the magnitude of the RA-induced
upregulation was 10-fold.

Since the PLC/PRF/5 cells constitutively overexpress the RAR
β mRNAs (Fig. 9), the experiment was repeated using the HEPG2
hepatoma cell-line, which has a level of RAR β expression similar
to that of normal adult liver (de The et al., 1987). In this
case, there was a greater (50-fold) RA-induced stimulation of the
levels of RAR β mRNAs (data not shown). Exposure of the
PLC/PRF/5 cells to RA (10^-6 M) for various periods indicated
that the induction had a latency of one hour, was complete after
four hours, and did not decrease after an overnight treatment
(Fig. 10B, lower panel). After hybridizing the same filters with
an RAR α probe, no variation was found in the level of the α
receptor mRNAs (Fig. 10, upper panel), indicating that RA had no
effect on the expression of the RAR α gene.

B.3. Effect of inhibitors.

To investigate the mechanism of activation of the RAR β gene by
RA, experiments with PLC/PRF/5 cells were performed in the
presence or absence of various inhibitors of transcription or
translation; control cells were treated with ethanol (E).

More particularly, PLC/PRF/5 cells were exposed to
charcoal-stripped medium for 6 hr; subsequently ethanol (E), RA,
and/or the inhibitors cycloheximide (CH) (10 µg/ml) or
actinomycin D (AC) (5 µg/ml) were added for an additional 4 hr.
Northern-blotting was carried out using 30 µg of total RNA.
Figure 11 shows filters hybridized first to the RAR β probe (Fig.
11, right panel), then to the α probe (Fig. 11, left panel), and
finally to a β actin probe (Fig. 11, lower panel). Exposure
times were the same as for the experiments in Figure 10.

The RNA synthesis inhibitor actinomycin D (AC) abolished the
RA-induced increase in the levels of RAR β transcripts (compare
the RA+AC lane to the RA and E+AC lanes), while the protein
synthesis inhibitor cycloheximide (CH) did not (compare lanes
RA+CH to CH). Neither RA, AC, nor CH significantly affected the
levels of β actin mRNA (Fig. 11, lower panel). These findings
suggest that RA-induction of the β receptor gene results from a
direct transcriptional effect. When the same filters were
rehybridized to the RAR α probe (Fig. 11, left panel), the
presence or absence of RA had no effect on the levels of RAR α
mRNAs, confirming that the RAR α gene is not regulated by RA.

B.4. Nuclear transcript elongation analysis.

Nuclear run-on experiments were carried out to determine if
the enhanced expression of the RAR β gene was due to increased
transcription. PLC/PRF/5 cells were grown in the presence of
ethanol (E) or retinoic acid (RA), their nuclei were isolated,
and transcription was performed in the presence of (32P)UTP. The
labelled RNAs were hybridized to filters containing
single-stranded RAR β cDNA inserts in the appropriate orientation
(S (sense) 10 µg and 1 µg), or in the reverse orientation (AS
(antisense) 20 µg). A β actin control was also included.
Exposure time was 12 hours. The results are shown in Figure 12.

The specific hybridization, which reflects the transcription
rate, is clearly induced by RA. In addition, the magnitude of
the increase in RAR β mRNAs is comparable when assessed by run-on
assays (5 to 7 fold) or Northern analysis (8 to 10 fold). These
experiments establish that the RAR β gene is transcriptionally
upregulated by RA.

Nuclear transcript elongation assays were also used to
investigate whether the higher steady-state levels of RAR β mRNAs
observed in the hepatoma cells PLC/PRF/5 compared to HEPG2 (de
The et al., 1987) were related to differences in transcription
rates. Transcript elongation assays were performed with
PLC/PRF/5 and HEPG2 cells as described below in material and
methods, in the absence of added RA. The filters contained,
respectively, 10 µg and 20 µg of sense (S) and antisense (AS)
RAR β cDNA inserts. Exposure time was 24 hours. The results are
shown in Figure 13.

A much greater specific hybridization signal, relative to
the β actin control, was observed in PLC/PRF/5 cells compared to
the HEPG2 cells (Fig. 13), indicating that their transcription
rates are different. This result suggests that at least some of
the variations in RAR β expression in the human tissues and
cell-lines (Fig. 9) might be due, in a similar manner, to
differences in the transcription rates of the RAR β gene.

B.5. Stability of RAR mRNAs.

The level of RAR β mRNAs was slightly higher after
cycloheximide treatment (compare the E lane to the CH lane in
Fig. 11, right panel). In the presence of RA, CH treatment
caused approximately a 50-fold increase in the level of RAR β
gene expression (compare lane E to RA+CH). Such superinduction
by cycloheximide has been described for several genes and
associated with either transcriptional or post-transcriptional
mechanisms (Greenberg et al., 1986).

To determine whether RNA stabilization was involved in the
induction by CH, PLC/PRF/5 cells were first stimulated for 3
hours by RA (10^-6 M) in the presence of CH (10 µg/ml) and
extensively washed with culture medium. Transcription was then
blocked by addition of actinomycin D (5 µg/ml) and the level of
RAR mRNAs was monitored for the next 5 hours in the presence or
absence of CH. Northern-blotting was done using 30 µg of total
RNA. The results are shown in Figure 14. The filters were
hybridized first to the RAR β probe (Fig. 14, right panel), then to
the α probe (Fig. 14, left panel), and lastly to a β actin probe
(Fig. 14, lower panel). Exposure times were as in Figure 10.

Quantification of the RAR β mRNA levels indicated that CH
indeed stabilized the β transcripts, as CH increased their
half-life from approximately 50 to 80 min (Fig. 14, right panel).
The combined effect of increased transcription and reduced
degradation may account for the synergistic effect of RA and CH
on β mRNA levels. In the case of RAR α, cycloheximide treatment
caused only a slight increase in mRNA levels and no
superinduction by RA was observed (Fig. 11, left panel). In
addition, the α receptor mRNAs, which have a half-life of at
least 5 hours, are more stable than the RAR β transcripts (Fig.
14, left panel). A pentanucleotide, ATTTA, in A/T rich 3'
non-coding regions seems to mediate mRNA degradation (Shaw and
Kamen, 1986). The 3.2 kb RAR α transcript has an A/T poor 3' end
(38%) and contains two such motifs (Giguere et al., 1987;
Petkovich et al., 1987), whereas the 3 kb RAR β mRNA has an A/T
rich 3' end (68%) and four copies of ATTTA (de The et al., 1987).
These findings are consistent with the differences in RAR α and β
mRNA stability that have been observed.
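
The two quantities discussed above, transcript half-life and the
A/T content and ATTTA motif count of a 3' non-coding region, can be
computed as in the following minimal sketch. It assumes simple
first-order decay after the transcription block, which is an
assumption made here for illustration. The function names and the
short demonstration sequence are hypothetical and are not taken
from the RAR sequences.

```python
import re


def fraction_remaining(minutes, half_life_min):
    """Fraction of mRNA left after `minutes`, assuming first-order decay."""
    return 0.5 ** (minutes / half_life_min)


def count_motif(seq, motif="ATTTA"):
    """Count occurrences of `motif`, including overlapping ones."""
    pattern = "(?=" + re.escape(motif) + ")"
    return len(re.findall(pattern, seq.upper()))


def at_fraction(seq):
    """Fraction of bases in `seq` that are A or T."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


# Decay with the two half-lives reported for the beta transcripts
# (about 50 min without CH, about 80 min with CH).
for t in (0, 60, 120, 300):
    without_ch = fraction_remaining(t, 50)
    with_ch = fraction_remaining(t, 80)
    print(f"{t:3d} min  without CH: {without_ch:.3f}  with CH: {with_ch:.3f}")

# Hypothetical demonstration sequence, not an RAR sequence.
demo = "GCATTTATTTAGC"
print(count_motif(demo), round(at_fraction(demo), 2))
```

The lookahead pattern is used so that overlapping copies of the
motif, as in ATTTATTTA, are each counted.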






B.6. MATERIAL AND METHODS 

B.6.1. Biological samples and cell-lines.
Human tissue samples were obtained from early autopsies and
kept at -80°C prior to extraction. The HEPG2 and PLC/PRF/5
hepatoma cell-lines were grown in Dulbecco's modified Eagle's
medium with 10% fetal calf serum, glutamine, and antibiotics, in
5% CO2. Semiconfluent cells were treated with RA after a 6 h
wash-out in charcoal-stripped medium. All-trans-retinoic acid was
obtained from Sigma. Cycloheximide and actinomycin D (both from
Sigma) were used at concentrations of 10 and 5 µg/ml
(micrograms/milliliter), respectively.

B.6.2. RNA preparation.

The RNA was prepared by the hot phenol procedure (Maniatis
et al., 1982). Poly(A)+ mRNA was prepared by oligo(dT)-cellulose
chromatography. For Northern-blot analysis, total RNA (30 µg) or
poly(A)+ mRNA (5 µg) was denatured by glyoxal and fractionated on
a 1.2% agarose gel (Maniatis et al., 1982). The nucleic acid was
transferred to nylon membranes (Amersham) by blotting and
attached by UV exposure plus baking.

B.6.3. Recombinant clones.

The β receptor probe was a 600 bp fragment of the cDNA
previously described (de The et al., 1987) extending from the 5'
end to the XhoI site, corresponding to the 5' untranslated region
and the A/B domain. The α receptor probe was a short cDNA insert
that was isolated from a PLC/PRF/5 human hepatoma cell-line cDNA
library generated as described (Watson and Jackson, 1986). This
library was hybridized with an RAR β-derived probe (nucleotides
550 to 760) corresponding to the conserved DNA-binding domain of
RAR β. A weakly hybridizing plaque was purified, subcloned into
M13mp18, and sequenced by the dideoxy procedure. This clone was
found to be identical to RAR α and extended from nucleotides 358
to 587, corresponding to the C and D domains (Giguere et al.,
1987). Since this cDNA insert contains some regions homologous
to the RAR β cDNA, cross-hybridization has occasionally been
observed, particularly in cell-lines that overexpress RAR β mRNAs.

B.6.4. Hybridization procedure.

The two cDNA inserts were subcloned into M13 and used to
generate high specific activity (greater than 10^9 cpm/µg)
single-stranded probes by elongation of a sequencing primer with
32P-labelled dTTP (3000 Ci/mmol) and unlabelled nucleotides by
Klenow polymerase. The resulting double-stranded DNA was
digested using a unique site in the vector and fractionated on a
urea/acrylamide sequencing gel, and the labelled single-stranded
insert was electroeluted. These probes (5x10^6 cpm/ml) were
hybridized to the filters in 7% (w/v) sodium dodecyl sulfate (SDS),
0.5 M NaPO4 pH 6.5, 1 mM ethylenediaminetetraacetate (EDTA), and
1 mg/ml bovine serum albumin (BSA) at 68°C overnight. The
filters were washed in 1% SDS, 50 mM NaCl, 1 mM EDTA at 68°C for
10 min and autoradiographed at -70°C using Kodak XAR films and
intensifying screens. A mouse β actin probe was used to
rehybridize the filters and check that all lanes contained equal
amounts of RNA.

B.6.5. Nuclear run-on experiments.

Nuclear transcript elongation assays were performed as 
described (Mezger et al., 1987). PLC/PRF/5 or HEPG2 cells (10^8)
were challenged with ethanol or with 10^-6 M RA for 6 hours in
charcoal-stripped medium. After isolation of the nuclei,
transcription was performed in a final volume of 100 µl
(microliters) with 150 µCi (microcuries) of (α-32P)UTP (3000
Ci/mmol). Typical
incorporation ranged between 2 and 6x10 cpm. The labelled RNA 
was hybridized to nylon filters (Amersham) containing 10 µg and 1
µg of a 3' end RAR β cDNA insert (position 2495 to 2992, de The
et al., 1987) cloned in M13; 20 µg of the same insert in the
reverse orientation were included as a negative control. A plasmid
containing a mouse β actin insert (4 µg) provided a positive and
quantitative hybridization control. Hybridization was performed
with a probe concentration of 2-6x10^7 cpm/ml for 48 hours.

The relative intensity of hybridization signals in 
Northern-blotting and run-on experiments was estimated using a 
Hoefer scanning densitometer and the appropriate computer pro- 
gram. 
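
The way relative intensities from a densitometer can be turned into
a fold-induction figure may be sketched as follows. The
normalization to the β actin signal is an assumption consistent
with its use here as a loading and hybridization control; the
function names and the numerical readings are hypothetical and do
not come from the experiments described.

```python
def normalized(signal, loading_control):
    """Express a band intensity relative to its loading control."""
    if loading_control <= 0:
        raise ValueError("loading control intensity must be positive")
    return signal / loading_control


def fold_induction(treated_signal, treated_control,
                   untreated_signal, untreated_control):
    """Ratio of normalized treated signal to normalized untreated signal."""
    treated = normalized(treated_signal, treated_control)
    untreated = normalized(untreated_signal, untreated_control)
    return treated / untreated


# Hypothetical densitometer readings in arbitrary units.
fold = fold_induction(
    treated_signal=840.0,     # target band, RA-treated lane
    treated_control=105.0,    # actin band, RA-treated lane
    untreated_signal=100.0,   # target band, ethanol lane
    untreated_control=100.0,  # actin band, ethanol lane
)
print(f"{fold:.1f}-fold induction")
```

Dividing by the control band in each lane corrects for unequal
loading before the treated and untreated lanes are compared.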

Our results showing a direct autoregulation of the
transcription of the RAR-β gene imply that the retinoic acid
receptor β binds to its own gene promoter sequences. To identify
those sequences, several 5' coterminal RAR-β cDNA clones were
derived from the PLC/PRF/5 library previously described.
Nucleotide sequence analysis showed that these clones extended
our previous λ13 RAR-β clone by 72 bp, which are shown in Figure
15. Thus, this invention also provides the 72 bp nucleotide
sequence shown in Fig. 15, as well as a cloned DNA sequence
encoding a polypeptide of the hap gene, wherein the sequence has
the formula

CCCATGC 

GAGCTGTTT Q AAQQAC TGGGATGCCGAGAACGCGAGCGATCCGAGCAGGGTTTGTCTGGGCACCGT 
► ATGTTTOACTG T ftTGGATGTTCTGTCAGTGAGTCCTGGGCAAA'PCCTGATTG TACACTGCGAGTCC 
GTCTTCCTGCATGCTCCAGGAGAAAGCTCTCAAAGCATGCTTCAGTGGATTGACCCAAACCGAATG 






GCAGCATCGGCACACTGCTCAATCAATTGAAACACAGAGCACCAGCTCTGAGGAACTCGTCCCAAG 
■aXCGeft TCTCCACTTLL 1 LLiiC rCG AGTGAT €AAACCCTGCTTCGTCTGCCAGG ACAAATCATC 
AG 6GTACCACTA ¥€rGQ6 1 ' LAliHjLC r ij?GAGGGAT €AAGGGCTTTTTCCGCAGAAGTATTCAGAAG 
AATATGATTTACACTTGTCACCGAGATAAGAACTGTGTTATTAATAAAGTCACCAGGAATCGATGC 
CAATACTGTCGACTCCAGAAGTGCTTTGAAGTGGGAATGTCCAAAGAATCTGTCAGGAATGACAGG 
AACAAGAAAAAGAAGGAGACTTCGAAGCAAGAATGCACAGAGAGCTATGAAATGACAGCTGAGTTG 
(^r I ypc!p £^rLr,Aa\i r . wrrn i a a &a r rr arr a nn a ft ftrTTT rrr TTrAr jgP€QgAGCTGGGT 
AAATACACCACGAATTCCAGTGCTGACCATCGAGTCCGACTGGACCTGGGCCTCTGGGACAAATTC 
AGTGAACTGGCCACCAAGTGCATTATTAAGATCGTGGAGTTTGCTAAACGTCTGCCTGGTTTCACT 
GGCTTGACCATCGCAGACCAAATTACCCTGCTGAAGGCCGCCTGCCTGGACATCCTGATTCTTAGA 
ATTTGCACCAGGTATACCCCAGAACAAGACACCATGACTTTCTCAGACGGCCTTACCCTAAATCGA 
ACTCAGATGCACAATGCTGGATTTGGTCCTCTGACTGACCTTGTGTTCACCTTTGCCAACCAGCTC 
CTGCCTTTGGAAATGGATGACACAGAAACAGGCCTTCTCAGTGCCATCTGCTTAATCTGTGGAGAC 
CGCCAGGACCTTGAGGAACCGACAAAAGTAGATAAGCTACAAGAACCATTGCTGGAAGCACTAAAA 
ATTTATATCAGAAAAAGACGACCCAGCAAGCCTCACATGTTTCCAAAGATCTTAATGAAAATCACA 
GATCTCCGTAGCATCAGTGCTAAAGGTGCAGAGCGTGTAATTACCTTGAAAATGGAAATTCCTGGA 
TCAATGCCACCTCTCATTCAAGAAATGATGGAGAATTCTGAAGGACATGAACCCTTGACCCCAAGT 
TCAAGTGGGAACACAGCAGAGCACAGTCCTAGCATCTCACCCAGCTCAGTGGAAAACAGTGGGGTC 
AGTCAGTCACCACTCGTGCAAT AA , 

and serotypic variants thereof, wherein said DNA is in a purified 
form. 

This 72 bp sequence was used as a probe to screen a human
genomic library. Six overlapping clones were derived, and a 6 kb
HindIII-BamHI insert containing the probe was subcloned into
PTZ 16 at the same sites to give rise to the plasmid pPROHAP.
Since this genomic DNA insert is limited by the BamHI site
present in the original λ13 clone and contains the additional 72 bp
of the 5' end of the mRNA, it also contains the promoter region
and all the elements necessary for RAR-β gene expression and
regulation. Preliminary S1 analysis using the plasmid pPROHAP
end-labelled at the BamHI site suggests that the cloned RAR-β
cDNAs are full-size and that the cap site is indeed located in
the 90 bp BamHI-EcoRI fragment.

A complete restriction map of the HindIII-BamHI genomic DNA
insert is shown in Figure 16.

Plasmid pPROHAP was transformed into the E. coli strain
DH5αF' (from B.R.L.). A viable culture of E. coli strain DH5αF'
transformed with plasmid pPROHAP was deposited on November 29,
1988, with the National Collection of Cultures of Microorganisms
or Collection Nationale de Cultures de Micro-organismes (C.N.C.M.)
of Institut Pasteur, Paris, France, under Culture Collection
Accession No. C.N.C.M. I-821.

This DNA insert, which is characterized by its restriction
map and partial nucleotide sequence (or some of its fragments),
provides a tool to assess RAR-β function, because it must contain
an RAR-responsive enhancer. Several constructs in which this
promoter region controls the expression of indicator genes, such
as the β-galactosidase or the chloramphenicol acetyl transferase
(CAT) gene, have been designed. Transient or stable expression, in
eucaryotic cells, of these constructs, together with an expres-

Thus, this invention also provides a recombinant DNA
molecule comprising a DNA sequence coding for a retinoic acid
receptor, said DNA sequence coding on expression in a unicellular
host for a polypeptide displaying the retinoic acid and DNA
binding properties of RAR-β and being operatively linked to an
expression control sequence in said DNA molecule.

It should be apparent that the foregoing techniques as well
as other techniques known in the field of medicinal chemistry can
be employed to assay for agonists and antagonists of ligand
binding to RAR-β and binding of the RAR-β protein to DNA.
Specifically, this invention makes it possible to assay for a
substance that enhances the interaction of the ligand, the RAR-β
protein, the DNA, or combinations of these materials to elicit an
observable or measurable response. The substance can be an
endogenous physiological substance or it can be a natural or
synthetic drug.

This invention also makes it possible to assay for an
antagonist that inhibits the effect of an agonist, but has no
biological activity of its own in the RAR-β effector system. Thus,
for example, the invention can be employed to assay for a natural or
synthetic substance that competes for the same receptor site on
the RAR-β protein or the DNA that the agonist occupies, or the
invention can be employed to assay for a substance that can act
on an allosteric site, which may result in allosteric inhibition.

It will be understood that this invention is not limited to
assaying for substances that interact only in a particular way,
but rather the invention is applicable to assaying for natural or
synthetic substances, which can act on one or more of the
receptor or recognition sites, including agonist binding sites,
competitive antagonist binding sites (accessory sites), and
non-competitive antagonist or regulatory binding sites
(allosteric sites).

A convenient procedure for carrying out the method of the
invention involves assaying a system for stimulation of RAR-β by
a retinoid. For example, as a retinoid binds to the receptor,
the receptor-ligand complex will bind to the responsive promoter
sequences and will activate transcription. For example,
transcription of the β-galactosidase or CAT genes can be determined.
The method of this invention makes it possible to screen
β-receptor binding retinoids. In addition, this invention makes
it possible to carry out blood tests for RAR-β activity in
patients.
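
How reporter-gene readings from such a screen might be reduced to a
single potency figure can be sketched as follows. This is only an
illustration under stated assumptions: the response is taken to
rise monotonically with dose, the first and last readings are taken
as baseline and maximum, and the half-maximal dose is found by
linear interpolation on log10 of the concentration. The function
name and the readings are hypothetical.

```python
import math


def estimate_ed50(concentrations, responses):
    """Estimate the half-maximal dose by interpolation on a log scale.

    Assumes `concentrations` are ascending and `responses` rise from a
    baseline (first value) to a plateau (last value).
    """
    baseline, top = responses[0], responses[-1]
    half = baseline + (top - baseline) / 2.0
    points = list(zip(concentrations, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_hi > r_lo and r_lo <= half <= r_hi:
            frac = (half - r_lo) / (r_hi - r_lo)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    raise ValueError("response never crosses the half-maximal level")


# Hypothetical reporter activities (arbitrary units) for one retinoid.
doses = [1e-11, 1e-10, 1e-9, 1e-8, 1e-7]   # mol/l
activity = [1.0, 2.0, 5.5, 9.0, 10.0]

ed50 = estimate_ed50(doses, activity)
print(f"estimated ED50: {ed50:.1e} M")
```

Candidate retinoids could then be ranked by the estimated value; a
curve fit would be preferable when enough data points are available.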



In summary, a hepatitis B virus (HBV) integration in a 147
bp cellular DNA fragment homologous to steroid receptors and
c-erbA/thyroid hormone receptor genes previously isolated from a
human hepatocellular carcinoma (HCC) was used as a probe to clone
the corresponding complementary DNA from a human liver cDNA
library. The nucleotide sequence analysis revealed that the
overall structure of the cellular gene, named hap, is similar to
that of DNA-binding hormone receptors. That is, it displays two
highly conserved regions identified as the putative DNA-binding
and hormone-binding domains of the c-erbA/steroid receptors. Six
out of seven hepatoma and hepatoma-derived cell-lines express a
2.5 kb hap mRNA species which is undetectable in normal adult and
fetal livers but present in all non-hepatic tissues analyzed.
Low stringency hybridization experiments revealed the existence
of hap-related genes in the human genome. Taken together, the
data suggest that the hap product may be a member of a new family
of ligand-responsive regulatory proteins whose inappropriate
expression in liver seems to correlate with the hepatocellular
transformed state.

Because the known receptors control the expression of target
genes that are crucial for cellular growth and differentiation,
an altered receptor could participate in cell transformation.
In that sense, the avian v-erbA oncogene, which does not by itself
induce neoplasms in animals, potentiates the erythroblast
transforming effects of v-erbB and other oncogenes of the src family
(Kahn et al., 1986). It has been shown that the v-erbA protein
has lost its hormone-binding potential (Sap et al., 1986),
presumably as a result of one or several mutations it has
accumulated in its putative ligand-binding domain. It has also been
suggested (Edwards et al., 1979) that the growth of human breast
tumors is correlated with the presence of significant levels of
ER. This invention may provide a novel example in which a
DNA-binding protein would again relate to oncogenic transformation
by interfering with the transcriptional regulation of target
genes. DNA-transfection assays using the native hap cDNA as well
as 'altered' hap genes derived from various HCC can provide
important information concerning any transforming capacity.